Often I get emails from people who are curious about my work. Sometimes they are journalists who want to do more data analysis, sometimes they are data geeks who want to do more journalism. Here are some common things I get asked.
How did you learn how to code?
It was there that I was given the building blocks for both computer science and journalistic storytelling. To demonstrate how little code I knew, here is a project I made at journalism school.
After that, it was almost a year of very little sleep and a lot of projects as I taught myself data and code while interning at journalistic outlets. At one internship I made a very simple interactive showing diversity in American newsrooms; at another I designed and developed their homicides tracker, which the newspaper continued to update for several years after my stint. After a full year of dabbling with projects here and there, I published my first D3.js project.
How should a newbie start doing data journalism?
Find a project and teach yourself everything you need to do in order to reach that end goal. Take small steps.
For reporters, I suggest starting with Excel. You don’t have to visualise straightaway. If Excel lets you add a new insight, a sentence or a paragraph to a story that would otherwise not have been there, that is a good start!
After you hit the wall with point-and-click tools, start learning how to code. Again, same logic: pick a project and learn the small bits you need for that project. I find this project-driven approach much more effective than continuously doing tutorials without applying the knowledge in a field relevant to you.
Is it important to learn to code?
Absolutely not. Lots of amazing data journalists and graphics journalists I know do not code at all.
Code literacy definitely helps, because if you know what is possible, you can get help from other people who have those skills. There is a LOT you can do with Excel when it comes to analysis. Similarly, there is a lot you can do within Illustrator and GUI tools for visualisation.
What tools do you use?
The list is huge, and it is mostly a matter of convenience and personal preference rather than the efficacy of a tool. I can’t tell you which tool you should use, but here is a list of everything I use often.
- Scraping: Node.js, GitHub Actions
- Mapping/Geospatial Analysis: QGIS, D3.js, GDAL, mapshaper, turf.js
- Data cleaning: OpenRefine, Node.js, R
- Data analysis: Node.js, R, Excel, pandas
- Data visualisation: Datawrapper, HTML/CSS, Canvas, D3.js, ggplot2, Adobe Illustrator, ai2html
- Video: Adobe After Effects, Adobe Premiere Pro
I find D3.js overwhelming. Is there somewhere else I can start?
When I started, I first built some interactives purely with underscore (helps in manipulating data on the page) and jQuery (helps in manipulating design elements on the page). When I got super comfortable with that, I started making things in D3. If you find D3 overwhelming, don’t be disappointed. It does have a steep learning curve.
Examples of things I have built with just jQuery and underscore:
- What choices are you making when you buy mobile Internet in India?
- What’s the real cost of your student loan?
- Which newly elected MLA are you most like?
Where do you find data stories?
In your curiosity! So much of what I do is driven by questions. I was watching Anupama Chopra’s interview with some female singers, and there was this anecdote about how every other song is now sung by Arijit Singh, which was not the case earlier. In the past, Lata Mangeshkar and Asha Bhosle dominated these albums. I wanted to see if this could be proven quantitatively, and that is what drove me to do this piece.
So many stories are like that. You read something, you watch something, you live something in your daily life, and you are faced with a question. Data stories are similar to regular stories. You see a question and try to answer it with data. Sometimes the data to answer that question exists, sometimes it doesn’t.
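To make the "question first, data second" idea a little more concrete, here is a minimal sketch of how a question like "does one singer dominate a given year?" can be turned into a number. This is not the code behind any piece mentioned above. The records, the placeholder singer names, and the names `songs` and `top_singer_share` are all made up for illustration; real data would have to be collected and cleaned first.

```python
from collections import Counter, defaultdict

# Hypothetical records: (release_year, lead_singer).
# Placeholder values only, not real song data.
songs = [
    (2005, "Singer A"),
    (2005, "Singer B"),
    (2005, "Singer C"),
    (2005, "Singer A"),
    (2015, "Singer D"),
    (2015, "Singer D"),
    (2015, "Singer D"),
    (2015, "Singer B"),
]


def top_singer_share(records):
    """Return {year: (most_frequent_singer, share_of_that_year's_songs)}."""
    # Count songs per singer, separately for each year.
    by_year = defaultdict(Counter)
    for year, singer in records:
        by_year[year][singer] += 1

    result = {}
    for year, counts in sorted(by_year.items()):
        singer, n = counts.most_common(1)[0]
        result[year] = (singer, n / sum(counts.values()))
    return result


for year, (singer, share) in top_singer_share(songs).items():
    print(f"{year}: {singer} sang {share:.0%} of songs")
```

If the top singer's share climbs steadily from year to year in real data, the anecdote has some quantitative backing; if it does not, that is a finding too.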