I was lucky to work with super talented and kind people in a fantastic lab (https://cgmlab.org/) for 7 months during my MSc studies.
I aimed to mine metagenomic sequencing data to find enzyme variants of interest.
I created a sort of pipeline (MetEOR, https://github.com/jacgonisa/MSc_thesis) that was very useful to explore the plastic-degrading enzyme family. I'd love to publish my findings in the future.
You will notice that the pipeline is flawed in many ways and lacks many good practices in software development (I am neither a CS student nor a software engineer; I just did my best to assemble something useful...).
As an exercise (mainly intended for BSc students or beginners in bioinformatics), I think it might be cool to spot the biggest pitfalls of the pipeline :)
I'll start:
- The documentation could be more thorough. The format is pretty, but the content leaves much to be desired!
Feel free to be very critical! ;)
