Setting your project on the right track by writing a concept paper

You want to run a research project. You have gone through my checklist, and you know what you want to do and why you want to do it. Now comes the time to bring it all together into an output that can be used to evaluate your idea and provide a starting point for the whole thing, whether it is an undergraduate project, a master’s thesis, or part of your PhD. This is the goal of a concept paper. You can think of a concept paper as a preregistration of your research project. I will try to give a general outline of what I would expect to see in a concept paper. However, these elements might vary depending on the person you are asking, so if you search for “concept paper” online, do not be surprised if you see a different structure every time.

If you have not yet run through the checklist for identifying good research questions, start there to make sure that you can actually answer the questions you plan to ask.

0. The Checklist

Making sure your idea is not terrible

If you want to propose your own idea, which you think fits in my overall themes (or is just so cool you might convert me), I am giving you an easy 7-step format (1 or 2 more steps if you want to do something more complex) to structure your idea into a decent project. This is not the only way to do things well, but it is a way that does things well consistently, and therefore I am comfortable sharing it. The goal is to structure your idea in a way that helps you plan your actual project. At the start you should be confident in your understanding of steps 1 and 2, and work your way into answering the rest of the questions. I would expect you to have a good idea for all steps by the first few weeks of the project at the latest.

Research-focused project

Step 1. There is a problem in the world: [what do you want to contribute toward solving?]
Step 2. That problem is important because: [why do you want to contribute toward solving it?]
Step 3. A cursory search shows me other people who tried to fix this problem: [basic literature search]
Step 4. A better way to solve it would be to do: [what you are proposing]
Step 5. This raises the following research questions: [what are the questions you are trying to answer?]
Step 6. I will answer those research questions by performing the following experiments: [plan of work]
Step 7. I will validate my hypothesis based on the following baselines: [competing approaches]

Engineering-focused project

Step 1. There is a problem in the world: [what do you want to contribute toward solving?]
Step 2. That problem is important because: [why do you want to contribute toward solving it?]
Step 3. A cursory search shows me other people who tried to fix this problem: [basic literature search]
Step 4. A better way to solve it would be to do: [what you are proposing]
Step 5. This raises the following feature requirements: [what does your software need to succeed?]
Step 6. I will meet those requirements by implementing the following software: [plan of work]
Step 7. I will validate my requirements as follows: [how you plan to user-test your software]

Hybrid project

Step 1. There is a problem in the world: [what do you want to contribute toward solving?]
Step 2. That problem is important because: [why do you want to contribute toward solving it?]
Step 3. A cursory search shows me other people who tried to fix this problem: [basic literature search]
Step 4. A better way to solve it would be to do: [what you are proposing]
Step 5. This raises the following research questions: [what are the questions you are trying to answer?]
Step 6. I will answer those research questions by performing the following experiments: [plan of work]
Step 7. In order to perform those experiments, I need to build a piece of software with the following requirements: [what does your software need to succeed?]
Step 8a. I will evaluate my software based on the following methodology: [how you plan to user-test your software]
Step 8b. I will validate my hypothesis based on the following baselines: [competing approaches]

Sanity-checking your research questions

In my teaching, I usually refer to five characteristics of a good research question, which I remember using the SPAIN mnemonic:

  • Specific: you need to refer to specific quantities and how they relate to each other. “Is algorithm X better than algorithm Y?” is a bad question because it uses undefined quantities (what is better? is it faster? more accurate?), an undefined context (for which task? sentiment analysis? argument mining? general classification?), and does not have a real criterion for answering (how much better is better? is 0.00001 better really better?).
  • Plausible: the thing you are investigating needs to be plausible. “Does coding on a black keyboard help your machine learning algorithm work better?” is a bad research question, because there is no plausible mechanism for it and therefore regardless of the result it would be a waste of time.
  • Answerable: can you actually answer this question? Do you have access to relevant data? Enough of it? Do you have access to compute to run the models? Do you have enough to run all your models in time for you to write up your project?
  • Interesting (or alternatively Impactful): this is a tricky one, but the question needs to be interesting. What is the impact of answering this question? Does knowing the answer help researchers?
  • Novel: this one is the most misunderstood by students, because the expectation of novelty is radically different for an undergraduate student, a postgraduate taught student, and a postgraduate research student. A PhD student is expected to output grade A novelty work, meaning that it is novel in an impactful way. UG and PGT students are expected to output grade B novelty work, meaning that it just needs to not be a carbon copy of an existing project. It does not need to change the landscape of research forever (although it can be excellent and publishable work), only to not be a simple copy of some Kaggle notebook you found somewhere.
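To make the "Specific" criterion concrete, here is a minimal sketch of what a specific version of “Is algorithm X better than algorithm Y?” could look like once the metric, the context, and the criterion are fixed before running anything. The function name, the choice of macro-F1 over five seeds, the 0.01 threshold, and the scores themselves are all made-up assumptions for illustration, not results or a recommended standard.

```python
from statistics import mean


def meets_threshold(scores_x, scores_y, min_improvement):
    """Compare mean scores against a threshold chosen *before* the experiment.

    Returns (mean difference, True if the difference reaches the threshold).
    """
    if not scores_x or len(scores_x) != len(scores_y):
        raise ValueError("need two non-empty score lists of equal length")
    diff = mean(scores_x) - mean(scores_y)
    return diff, diff >= min_improvement


# Hypothetical macro-F1 scores on one fixed task, one value per random seed.
scores_x = [0.81, 0.83, 0.82, 0.80, 0.84]
scores_y = [0.79, 0.80, 0.78, 0.80, 0.79]

# The research question now reads: "Does X improve mean macro-F1 over Y
# by at least 0.01 on this task?" instead of "Is X better than Y?".
diff, ok = meets_threshold(scores_x, scores_y, min_improvement=0.01)
print(f"mean improvement = {diff:.3f}, meets threshold: {ok}")
```

A comparison of means is only the simplest possible criterion; in a real project you would probably also want a significance test or confidence interval, but the point here is that the quantity, the task, and the threshold are all written down in advance.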

1. Problem Statement

Begin by meticulously defining the problem, issue, or gap in knowledge that your concept intends to address. Why is this problem worth solving, and what are the potential consequences of leaving it unaddressed? Try to support your problem statement with data or references to previous studies, if relevant. Consider the broader context – what is the societal, environmental, or theoretical significance of addressing this problem?


2. Mission and Impact

This is a concise statement summarising the overall purpose of your project. It should answer the question, “What is this project meant to achieve?”. A strong mission statement might be, “This project aims to develop a novel intervention strategy for reducing school absenteeism within underserved communities.” You should also outline the anticipated significance and outcomes of your work. What contributions will this project make to its field? How could the results advance existing knowledge, change current practices, or benefit society? Be ambitious but realistic in this section. Your project might aim to increase understanding of a social issue, inform policy development, provide recommendations for practitioners, or provide a framework for future research.


3. Aims and Objectives

In this section, you’ll delve into the specific goals of your research. What questions are you aiming to answer? What hypotheses do you intend to test? Frame your research aims in a specific and measurable way, and make sure they align directly with both your problem statement and the project’s overall mission. This is also where you should think about the quality of your research questions: check each one against the SPAIN criteria from the checklist above (Specific, Plausible, Answerable, Interesting, Novel).


4. Methodology

Describe the research methods and strategies you propose to use. Will you be using primary data collection (such as surveys, interviews, or focus groups), quantitative analysis, qualitative studies, or a mixed-methods approach? Explain the reasoning behind your choices – why are these methods the best way to achieve your research aims? Additionally, explain how you will ensure the rigour of your chosen methodology.

5. Time and Resources

One key element of the feasibility of your project will be whether you actually have what you need – time, resources, people, budget – to do it. Those things interact with each other (some projects could be done with more time and less budget, or less time and more budget) and so they should be considered together.

(a) Timeline

Develop a realistic and achievable schedule that outlines the key phases of your project, including expected milestones. Consider factors such as data collection periods, analysis timelines, and dissemination activities when constructing your timeline. Be mindful of potential delays and build in buffer periods where necessary.
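As a rough illustration of building buffer periods into a schedule, the sketch below lays project phases end to end and pads each one by a fixed fraction of its length. The phase names, durations, start date, and the 20% buffer are hypothetical placeholders; pick values that reflect your own project and your own history of underestimating things.

```python
from datetime import date, timedelta


def build_timeline(start, phases, buffer_fraction=0.2):
    """Return (name, start_date, end_date) for each phase, in order.

    Each phase lasts its planned number of weeks plus a buffer equal to
    buffer_fraction of that length, rounded to whole days.
    """
    timeline = []
    current = start
    for name, weeks in phases:
        padded_days = round(weeks * 7 * (1 + buffer_fraction))
        end = current + timedelta(days=padded_days)
        timeline.append((name, current, end))
        current = end  # the next phase starts when this one ends
    return timeline


# Hypothetical phases with planned durations in weeks.
phases = [
    ("Literature review", 3),
    ("Data collection", 4),
    ("Analysis", 3),
    ("Write-up", 2),
]

plan = build_timeline(date(2025, 1, 6), phases)
for name, begin, end in plan:
    print(f"{name:<18} {begin} -> {end}")
```

If the final end date lands after your submission deadline, that is a sign to cut scope now rather than to shrink the buffer.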

(b) Resources

Outline the essential resources required for the successful completion of your project. This could encompass:

  • Budget: Provide a detailed breakdown of anticipated costs for various project components, such as equipment, participant incentives, travel, software licenses, and publication fees. Be as specific as possible to demonstrate financial responsibility.
  • Compute: If your project involves complex data analysis or simulations, mention any computational resources you’ll need. This could include access to high-performance computing clusters, specialised software, or cloud computing platforms.
  • People: Identify the required personnel, collaborators, or external partners needed for your project. Briefly describe their roles and expertise. Consider the time commitment needed from each team member and ensure you have the necessary personnel in place to conduct the research effectively. In the case of a student project, keep in mind that you will usually be the sole author of everything, although there are exceptions (e.g., group projects).
  • Training: Usually, defining a project requires you to be competent enough in the field to run it. However, that might not always be the case. For example, maybe you identified an interesting research gap on the automated analysis of fNIRS data using some new flavour of neural networks that you have good reasons to think might work, but you need special training to learn how to implement them on our GPU cluster using PyTorch. Or maybe you know you need to run a user study to evaluate your work, but you do not know how to run one or how to analyse its results.
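For the compute item in particular, a back-of-the-envelope estimate is often enough to tell whether the plan is feasible. The sketch below is one way to do that arithmetic; the function names, the 25% margin for failed or repeated runs, and all the numbers are assumptions for illustration, not figures for any particular cluster.

```python
def gpu_hours_needed(n_configs, n_seeds, hours_per_run, rerun_margin=0.25):
    """Estimate total GPU hours, padded by a margin for failed or repeated runs."""
    base = n_configs * n_seeds * hours_per_run
    return base * (1 + rerun_margin)


def fits_in_time(total_gpu_hours, n_gpus, weeks_available, hours_per_gpu_per_week):
    """Return (True if the work fits, available GPU hours)."""
    capacity = n_gpus * weeks_available * hours_per_gpu_per_week
    return total_gpu_hours <= capacity, capacity


# Hypothetical plan: 12 model configurations, 5 seeds each, 2 hours per run.
needed = gpu_hours_needed(n_configs=12, n_seeds=5, hours_per_run=2)

# Hypothetical allocation: 1 GPU for 4 weeks, about 40 usable hours per week.
fits, capacity = fits_in_time(needed, n_gpus=1, weeks_available=4,
                              hours_per_gpu_per_week=40)
print(f"needed: {needed:.0f} GPU hours, available: {capacity}, fits: {fits}")
```

If the estimate does not fit, that is exactly the kind of time-versus-resources trade-off the concept paper should discuss: fewer configurations, fewer seeds, more GPUs, or more weeks.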