Helpful; thanks!

"Ten Simple Rules for Reproducible Computational Research" http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003285

* Rule 1: For Every Result, Keep Track of How It Was Produced

* Rule 2: Avoid Manual Data Manipulation Steps

* Rule 3: Archive the Exact Versions of All External Programs Used

* Rule 4: Version Control All Custom Scripts

* Rule 5: Record All Intermediate Results, When Possible in Standardized Formats

* Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds

* Rule 7: Always Store Raw Data behind Plots

* Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected

* Rule 9: Connect Textual Statements to Underlying Results

* Rule 10: Provide Public Access to Scripts, Runs, and Results

Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
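Rule 6 is particularly easy to automate. A minimal Python sketch (my own illustration, not from the paper) of recording the seed alongside the output it produced, so a run can be replayed exactly:

    import json, random, time

    # Choose and record the seed up front so the run can be reproduced.
    seed = int(time.time())
    random.seed(seed)

    results = {"samples": [random.random() for _ in range(5)]}

    # Store the seed next to the results it generated (Rule 6, and Rule 7 in spirit).
    with open("run_output.json", "w") as f:
        json.dump({"seed": seed, "results": results}, f, indent=2)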
Great post! You mention that it is important to "be skeptical". I concur, and would add that it helps to approach the analysis from an unbiased standpoint. Even if you go into the analysis with certain goals in mind, it is not only more ethical but also more persuasive to report any inconsistencies in your findings.
I think for "Profile your data", some tools like OpenRefine really help. <a href="http://openrefine.org" rel="nofollow">http://openrefine.org</a>