Archive for the ‘life of student’ Category

With 3 months left until my PhD stipend runs out, I have been thinking about the next stage of my career. Looking at the job ads, the market is a bit different now. It’s good to assess my situation and see what the market has to offer.

A lot of my friends who also enrolled in the PhD bioinformatics program have found their next career stage: either staying in the same lab, already working in IB/consultancy, or even going to retire (!) in S. Africa.

I’d like to stay in research. I love research. And I’d love to find a job around Cambridge, where my girlfriend is starting a PhD this Oct. If possible, I’d like to get involved in 1000 genomes or large sequencing projects. It’s an exciting time for (re)sequencing.

What I have (or I think I have) may look good

  • a distinction in MSc Bioinformatics
  • potentially 4 first-author papers (does it matter? I have no idea) towards the end of PhD
  • used Java extensively in the early days, but as data got messier and larger, I have been using Perl only.
  • good statistical knowledge
  • involved in various genome sequencing projects

What the market* wants (that would be really nice)

  • database design
  • software engineering experience
  • C++ or Java. (To be fair I don’t see much point in this..)
  • experience in Web applications (RoR..etc)
  • have used Bio: modules (BioPerl, BioPython) extensively
  • knowledge in server deployment

Most of my PhD involves sets of 10MB (~multiple copies of yeast genomes) of data that can be read from flat files. 10MB! I thought it was overkill to construct a database for it. Now I regret it a little, as database management would be a big, big plus in what people are looking for.

Most ads mention that you need one ‘scripting language’ like Perl and another language like C++ or Java. Personally I am not so sure about this. The vast amount of data generated in the life sciences means any language would be too slow. Rather, I would concentrate on Perl (with a slight hope that Perl 6 will take over the world), and will learn Python (or maybe Ruby) thoroughly.

And another thing I have missed out on is using the Bio: modules, for example BioPerl. BioPerl is a very powerful package, but most of the things are pretty much ‘standard’. In the end I just write everything I want from scratch (for those of you who have used DNAsp before, I have rewritten almost everything in Perl..).

One good thing about bioinformatics jobs is that they are usually very specific. Database? Python? Web application? You name it. And it’s not difficult to learn and prepare. You just have to keep practicing. Reading blogs and friendfeed will give you some ideas about what skills/topics people are interested in.

So, to prepare for the jobs after PhD, I think I will contribute to some of the Bio: modules. I will learn Python thoroughly, brush up on some of my statistical techniques, and keep throwing my CV at people.

And of course, finish writing up my thesis first.

*from jobs.ac.uk, Sanger/EBI jobs, evoldir. This may not be a representative sample.



Like fellow student Michael, I am also going to SMBE in Barcelona this year. I will be presenting a poster (sigh, when will I ever be able to present a talk? 😛 ).

I have always loved going to conferences, perhaps even more than anyone. Why? Being in a small family business (boss no 1, boss no 2 aka boss no 1’s wife, and me), this is one of the few chances to meet people and shout out my ideas. In my 2.5 years of PhD I have gone to 1 winter school, 5 workshops and 2 conferences. I have met some amazing people, though they may not remember me after all. As long as I remember them and have learnt lots of new ideas, that’s fine!

I would like to share things I learnt from going to these conferences, to help anyone make full use of conferences. First I’d like to say that I am an extremely introverted person, so obviously meeting people can be a bit tricky for me. Some of the points below may be obvious to some, but they are all my personal experiences.

  1. Prepare for the conference
    Not just the poster/talk (this is more like a must). Know who you want to meet. As a PhD student, I would like to meet some fellow PhD students who are struggling to write up and are considering the next career stage. Or you have read someone’s paper and you would like to speak to them about it. Or you encountered some problem implementing someone’s model. Even a potential collaboration. A conference is surprisingly short in terms of meeting people, so be prepared.
  2. Know how to introduce yourself, at the right place and right time
    Now you know who you would like to meet. Find the person and wait for the chance to introduce yourself and ask the question. Be patient. If you want to ask the big guns you will need to join the queue. Trust me, it will be worth it (e.g., having him explain it to you saves much more time than reading his paper another twenty times). And be brief when introducing yourself, and jump straight into the question (like “Hi! My name is Jason, and I have a question with regard to xxx”).

    Real case 1:
    I remember that last year a student asked a professor a question in the toilet. It didn’t turn out well..

    Real case 2:
    At my first conference I needed so badly to ask a professor from Oxford a question that I stalked him throughout the coffee break. He was always with someone, so it would have been rude to interrupt them. In the end I gave up, but being such a kind person (or maybe he was a bit freaked out by me), he actually came to me and asked what I wanted. With his help I was able to use his work to publish my first paper.
  3. Choose the question carefully; don’t ask open-ended questions
    A conference is sort of like speed dating. You find the person, who doesn’t have a lot of time for you, ask a question, and get an answer. So don’t ask questions like “what’s the meaning of life?” If you ask an interesting question and you click, he will be more likely to ask you questions back! And don’t ask questions like “Can I be a post doc in your lab?”; leave that for after the conference.
  4. Don’t be let down by the big results
    This probably doesn’t apply to everyone. I get disappointed by many things, like doing a poster instead of a talk. And sometimes I envy what people have come up with at the conferences. These results are always amazing, with an aura radiating around them. And then you start to blame yourself, “%£$%&$%£%$3…”. Well, don’t. One thing I realised is that you can’t do everything (although being a bioinformatician you are constantly tempted), so learn to appreciate these results, and perhaps adapt their theories into your own research.
  5. No big lunch/No overdrinking
    This also gets mentioned in various articles. I know it’s hard.. but I absolutely agree, given I am a big eater. If you eat too much, you won’t have the concentration to sit through the rest of the afternoon (no matter how much coffee you drink). I had some embarrassing experiences…
  6. Speak strictly to what you know
    Don’t bullshit or bluff or comment on things you have absolutely no idea about. And don’t try to steer the conversation towards something you already know. You will get your turn. This is merely a personal reflection, as I have had some conversations in which, whatever I said, the person would say “ah, this is interesting, but what I did in species x was…”, rather than openly discussing possible ideas.
  7. Take notes and follow up straight after
    Don’t wait. Tidy your notes. Start new analysis straight away if you can. Start sending emails. Otherwise you will forget in 2 weeks and there would have been no point in spending the money on the conference.
  8. Be yourself
    Whether you are intro/extravert, geek/nongeek, fashionable/dull… be yourself. Don’t try so hard to please others. Again, people know and you would make them uneasy. Everyone’s interesting in their own way, so just be yourself. If you don’t like being with the people you formed a group with, go to another one. Or rather, go back and tidy your notes. Make yourself useful if you think you can’t cope with some people.
  9. Enjoy!
    You get to visit a new city. You are meeting people whose papers/textbooks you have read throughout your research. You meet your own peers who are also struggling to work/write thesis/papers. People are friendly (overfriendly I would say) and criticise you with no hard feelings. Sometimes (only sometimes) you even get a compliment from someone saying your research is interesting, that would make it two! What’s not to enjoy?

This probably doesn’t apply to everyone, especially if you already have someone in the group to go with. I am glad that this year I actually know quite a few people (yes, even the professor I stalked will be at this conference). So enjoy it when you are at a conference!


From Phdcomics

Every so often when I am asked what exactly I am doing in my PhD, I think for quite some time as if I had no idea. I have started to notice this behaviour becoming increasingly common among my peers as well when they are asked the same question. This is not because we don’t know what we are doing, but rather because we do not know how to characterise ourselves.

I am currently enrolled as a PhD student in bioinformatics, and this is what I do in a typical day:

What I do (grant version)

Understanding the evolutionary forces that shape the genomic variations between and within different yeast species. A day of work would involve categorising and analysing polymorphism/divergence and estimating parameters that would explain the effectiveness of different evolutionary forces acting on the DNA level.

What I really do:

  1. I code and make tables. I have had a DNA sequence alignment since 2005. I have been trying to extract everything from the alignment for 2 years and counting. Whatever you can think of doing with a DNA alignment, I have done it all (yes, been there, done that). This either involves making the alignment into a SNP dataset, feeding it into some program someone has published and hoping the results come out fine. Or the program no longer works/does not quite suit/is no longer maintained, and you write one yourself. (Current achievement: writing almost every function of DNAsp with missing data in Perl) In the end you have a big big big table. (time spent: 5 minutes ~ days)
  2. I make graphs. big big big table -> human readable graphs. Depending on how much money your boss has, greyscaling the graph can be a real…. (time spent: 5 minutes ~ hours if publishing deadline approaching)
  3. I read papers. This will take ages. Papers with good results often mean you have to dig the real methods out of the supplementary materials. Read the papers as if you were the reviewer. (time spent: depending on the level of kindness of the authors)
  4. I interpret the results. This is what you work hard for. (time spent: 1 minute)
  5. I re-run everything I have done in the last 2 years, in 30 minutes. Results are fascinating, and you have to confirm them. Sure! Re-run everything you have done before, and realise your PhD is, in effect, 30 minutes of work if someone has published a good program to calculate everything (time spent: 30 minutes and days questioning yourself)
  6. I write. Results are fine and confirmed. Now I need to make sure everyone (aka supervisor) understands them. (time spent: depends)
  7. I moan. A problem people can see from the list above is that I have not done anything novel. The results are novel, but all the methods are old. Is that a PhD? I do know for a fact that I am not qualified for most jobs currently advertised, which say something like the desired candidates would need to design novel statistical tests. (time spent: forever)

(tea breaks/supervisor meetings/seminars/daydreaming/blog reading not included)
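
To give a flavour of step 1, here is a minimal Python sketch of turning an alignment into a SNP list and a diversity estimate while skipping missing data. It is not my actual Perl code and not how DNAsp does it: the function names, the toy alignment, the set of missing-data symbols, and the choice to pool pairwise differences over all compared sites are all assumptions made for illustration.

```python
from itertools import combinations

# Assumed symbols for missing data or gaps; sequences are assumed uppercase.
MISSING = set("N-?")


def snp_columns(alignment):
    """Return indices of columns with more than one observed base."""
    snps = []
    for i, column in enumerate(zip(*alignment)):
        observed = {base for base in column if base not in MISSING}
        if len(observed) > 1:
            snps.append(i)
    return snps


def nucleotide_diversity(alignment):
    """Pairwise differences per compared site, pooled over all pairs.

    Sites where either sequence of a pair has missing data are skipped
    for that pair only (pairwise deletion).
    """
    diffs = 0
    compared = 0
    for a, b in combinations(alignment, 2):
        for x, y in zip(a, b):
            if x in MISSING or y in MISSING:
                continue
            compared += 1
            diffs += x != y
    return diffs / compared if compared else 0.0


# Hypothetical toy alignment: three sequences, one missing base.
toy_alignment = [
    "ACGTACGT",
    "ACGTACGA",
    "ACNTTCGA",
]

print(snp_columns(toy_alignment))
print(nucleotide_diversity(toy_alignment))
```

The point of the pairwise deletion is that a single N does not throw away the whole column, which matters when the data are messy. Other ways of handling missing data (dropping whole columns, or averaging per site) would give slightly different numbers.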

This is my typical day of work, and I would call myself a genomic data analyst with experience of dealing with large genome datasets and good statistical knowledge and very quick coding hands and a very fundamental understanding of biochemistry/molecular biology and who knows a bit of everything (note: this is not a self-promotion post). This is what we do: we are biologists that have a lot of “and” in our roles.

And personally I have the integrity to make sure my paper is not just another genome analysis paper that publishes summary statistics and mentions future directions. I make sure other scientists can repeat my analysis in 30 minutes with my alignment or any well established alignment and see the significance of it.

So what exactly do you do?



…. other than the obligatory roles like mentoring their research?

When I first started my PhD I had just obtained a Masters degree in Bioinformatics from a good-rated university in the UK, having learnt a fair amount of Perl, Java, statistics and R, and recapped pretty much all of its biology component. I have three supervisors: two geneticists from the lab I am currently in and a theoretical physicist/mathematician. The system seemed simple: if I had problems in biology, I’d go to my main supervisor, and if I had problems in mathematics, I’d go to the mathematician.

It didn’t work out quite that way in the end, as most of the maths problems I encountered were about learning the maths itself, and that is just too trivial to ask a university academic…. so in my personal opinion the system is a total failure if the students themselves are not trained to at least engage in an intellectual conversation with their supervisors.

There are no lectures to attend or qualifications to take during the PhD in the UK system, on the assumption that all the starting students are talented and can dash into the project in no time! I was lucky that I did the Masters in the first place; though the statistics was new to me, I adapted pretty quickly. Still, the 3 scenarios below are quite common:

  1. someone who sequenced lots of genes and performed all their analyses in DNAsp and Excel
  2. someone who performed all his PhD analysis by refreshing web BLAST searches for his first two years (it’s really true!) and in Excel
  3. someone who is studying prediction of protein-protein interaction networks, who knew little about the biological significance of PINs.

In my personal view the geneticists are the “early bioinformaticians” in the biology field, and you can see that people from scenarios 1 and 2 come from a biology background, and 3 from a CS background. Which situation seems more dreadful? I say 3. Maybe it’s an excuse for me being a biology-background-not-as-good-as-the-rest-in-Maths person, but knowing the biological background and its significance is absolutely crucial in your research. Sure, it’s a bit ridiculous being a bioinformatics PhD student who knows little programming, but your lack of programming skills won’t show in your thesis. For CS people, you might get away with it and joke about “oh I know not much about it but I am good in Maths!”, but one day you are going to fall hard. The ending of scenario 3 was bad: this person didn’t know the difference between self-interacting proteins and subunits of one protein interacting with each other (they are all unique entities in one single database), so he made wrong assumptions about nodes with lots of edges in the first place…

Now the main point to focus on: do the supervisors need to discipline their students to learn everything associated with the research, or to specialise? If you are the molecular geneticist, will your supervisor push you to learn Perl/R in your spare time on top of the experimental work? If you are the CS person, will they draw you into the fascinating sides of biology? (As I read through I might be a bit biased :P) For bioinformaticians, I guess you need to learn it all (or not). Will it be too generalised and perhaps in a way a waste of too much time?

Another point that I think is just as important: the writing style (hence the image above). You have to write in a way that the biologists understand the maths, and the rest understand the biology. Easy for some, though in my own case, I often write quite the opposite way. I think this is one of the main roles of your supervisors: to train your writing. Whether they match the standards themselves is another issue.

Here is the paradox: the PhD is all about specialising. But naively, bioinformaticians = generalists, so how can you specialise in being a generalist?

I’ll use myself as an example, as I can’t see myself specialising in anything yet! I like to know a piece of work roughly 90%. If I see reversible jump MCMC, I’ll first start with Bayesian statistics for dummies then go from there. It takes time, but the satisfaction of reading someone’s elegant statistics while appreciating the significance of why (i.e. why you analyse/write/publish in the first place) is immense. This is often encouraged by my supervisors, who listen to how I explain the theories. One of my supervisors would spend ages sitting down with me reading my drafts sentence by sentence. It usually takes hours, and often her comments are longer than the draft itself, no nonsense!

However, being a generalist means I might not be able to get any jobs once I finish my PhD. I may, disappointingly, not be able to fit any of the criteria that any lab wants. I will be in effect too underqualified for post doc jobs. So we will see how this turns out.
