Disco Data Delivers

We've been taking a look at the data from our first test run and there's some pretty interesting stuff!

Good morning!  Preliminary and very exciting results are in from our ‘private trial’ of the platform, conducted at the NIH Intramural Campus and Brown University. At the time of writing, 128 scientists have joined the ‘Engine and have provided 498 unique ratings. Great credit goes to ‘Engineer Dmitrijs Celinskis for mapping DOIs from the rated papers to their corresponding Altmetric scores, and for starting to play with the results.

The upshot:  Our initial analyses indicate Discovery Value can identify unique subsets of papers. The figure shows our first 400 ratings binned into Discovery Value quartiles, along with the mean Altmetric score for each quartile.  For the first 3 quartiles, increasing Discovery Value predicts increasing Altmetric scores, with a distinct peak in the 3rd quartile.
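For the curious, the kind of quartile analysis described above can be sketched in a few lines. Everything here (column names, numbers) is a purely hypothetical stand-in, not the ‘Engine’s actual data or pipeline:

```python
# Illustrative sketch only: bin made-up Discovery Value ratings into
# quartiles and compute the mean Altmetric score within each quartile.
import pandas as pd

# Hypothetical (rating, Altmetric score) pairs standing in for real data.
df = pd.DataFrame({
    "discovery_value": [1, 2, 3, 4, 5, 6, 7, 8, 2, 5, 7, 3],
    "altmetric":       [4, 6, 9, 15, 22, 30, 18, 12, 5, 25, 20, 10],
})

# qcut splits the ratings into four equal-count bins (quartiles).
df["dv_quartile"] = pd.qcut(
    df["discovery_value"], q=4, labels=["Q1", "Q2", "Q3", "Q4"]
)

# Mean Altmetric score per Discovery Value quartile.
quartile_means = df.groupby("dv_quartile", observed=True)["altmetric"].mean()
print(quartile_means)
```

With these invented numbers the toy output even mimics the pattern described above: the mean Altmetric score rises through Q3 and then dips in Q4.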


This 3rd quartile is, by definition, those papers judged as making substantial but not elite Discovery contributions. This makes sense:  Papers that are changing minds about a topic should get more attention, a quantity that is very well measured by our colleagues at Altmetric.

In contrast, though, the group of papers with the highest Discovery Value (the 4th quartile) showed lower Altmetric scores. These data suggest that public attention is most prominently awarded to findings that change opinions but do so in understandable ways, potentially living within current paradigmatic structures. The most transformative papers, those that potentially threaten paradigms (and, by extension, the research programs of those reading them), received less public attention. This lack of attention could reflect, of course, readers not even knowing how to mention the more revolutionary work; it could also reflect another signature of the most transformative findings—a lack of Confidence in them until they are confirmed by replication in other labs.  Whatever the reasons, DV is, in fact, doing what we hoped—finding groups of papers that go beyond what public opinion (expressed in citations or Altmetrics) can show.

While this is all early days—have I used the words preliminary and initial enough?!—we are further encouraged that a different relationship was observed between Altmetric scores and ratings of Actionability—the rated utility of findings in a paper.  In this case, the utility of a paper had a mostly flat relationship with Altmetric scores, with one notable exception—the least useful papers are those that received the least attention.  Also, no quartile of Actionability data has a very profound relationship to Altmetric scores: basically, there isn’t a peak, nothing that comes close to the high scores seen in the 3rd quartile of DV.

So, in sum:  DV and Actionability have different relationships to Altmetric numbers, underscoring the independence of these two categories we are having people rate.  And, maybe (just maybe?) we are really able to find the papers that are making a change ahead of the curve of broader appreciation.  Let’s see how the next 1,000 ratings go as we march onward through the public beta, but it’s exciting times in any event.

Dawn of Disco

Greetings!  I’m told blog posts are supposed to be entertaining, personal, brief and casual…I’ll try…seeing as this is the first, I’ll provide an informal Discontext, giving background for how this all started.

The back-story is exciting to think about given where we are now. We started working on the ‘Engine 3 years ago.  At the time, I was trying to figure out how to help science above and beyond my ‘day’ job (being a Neuroscientist at Brown). I figured I would start by thinking about why I love that day job so much, and Discovery was the answer (note to self:  remember that slogan for a t-shirt…).  

Finding out new things about the brain, being in the recording room late at night when a neuron does something totally unique, or when your hacked code reveals a finding you never considered, is a fantastic moment. My addiction to that feeling is, when I’m honest with myself, a big part of what keeps me energized.

So, ‘make Discovery better’ became a bit of a mantra.  Quickly (ten or so minutes later), I realized that you can’t improve something you can’t define, so I’d better figure out in a systematic way what Discovery was if I wanted to make its pie higher.  

A few days later, I realized I’d always had an intuitive definition:  Whenever someone in lab asked how to choose a project, I’d answer “make findings that change the way we think about the brain.”  I’d then quickly point out that this goal is only one among many worthy goals that also motivate me (helping save lives, improving basic understanding of the models we work on, etc.). While these are also big influences, if you are faced with a choice, err on the side of trying to make a difference in understanding.

This priority was a very easy sell.  Many of the best graduate students and postdocs join science to make this kind of difference, to make an apple-lands-on-head kind of contribution.  Not unlike artists (or, at least, my stereotypical view of The Artist), most students join the cause out of a deep passion for learning new things about how the world works, and a desire to share that learning with others. Wealth accumulation is not part of the motivation, and they typically are not naïve as to the radical commitment it takes to be great at science.

Ok, great, now I had a definition for Discovery, and I knew that making this process better would be a help to the most important people in scientific research (i.e., students). But that definition was a word salad: a great phrase, but it wasn’t clear yet how it provided a path forward.  Around this time, like always, I was having great conversations with Carl Moore (yes, a relation—my father).  He’s arguably the smartest person I know (endstop), and certainly the smartest I know at helping people bring ideas into the world. We spent a lot of time and email bits thinking about what Discovery in science was, and how it related to and differed from other enterprises. He made the key suggestion that if this idea was going to be helpful, I should think about how to make it something a large group could contribute to, to make sure the barrier to entry was really low.  He had just read about a film festival in Cuba where the only rule was that the filmmakers could not have spent more than a set amount (low 5 figures, as I remember it) on making their film, which ensured participation by the most entrepreneurial people, who are usually the ones closest to understanding what is innovative.  This made a lot of sense, albeit in a vague way, and it sharpened the need to somehow figure out what Discovery was. I’ll note he has continued to be a great help, guiding our first group brainstorming session and then the first Board meeting…I chose well in who I got born to…this is getting personal enough for a blog, right?

A week or so later, as I was reading a book on the history of Bayes’ rule, it hit me.  I realized that my guidepost phrase was just a re-statement of Bayesian information.  I’m a little embarrassed it took that long to grok this, as it’s obvious, but I guess it highlights how intuitive my commitment was (I’m saying that to make myself feel better…).
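For readers who want the nerdy version: one common way to formalize “findings that change the way we think” is Bayesian surprise, the KL divergence between beliefs after seeing a finding and beliefs before. The sketch below is my illustration with made-up numbers, not the ‘Engine’s actual metric:

```python
# Toy illustration of Bayesian surprise: how far a finding moves beliefs.
import math

def kl_divergence(posterior, prior):
    """KL(posterior || prior) for discrete belief distributions, in bits."""
    return sum(p * math.log2(p / q) for p, q in zip(posterior, prior) if p > 0)

prior = [0.5, 0.3, 0.2]               # beliefs over three hypotheses, pre-paper
posterior_small = [0.55, 0.28, 0.17]  # a paper that barely shifts beliefs
posterior_big = [0.1, 0.2, 0.7]       # a paper that upends the favored hypothesis

print(kl_divergence(posterior_small, prior))  # small surprise
print(kl_divergence(posterior_big, prior))    # much larger surprise
```

A finding that leaves beliefs unchanged scores zero; one that flips the favored hypothesis scores high, which is the intuition behind “change the way we think about the brain.”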

At this time, I was also talking a lot with my good friend John Armstrong (co-founder of Disco) about life, the universe, etc. Amongst all that, we started having great conversations about Discovery, Bayes, and changing the world.  I was also talking a lot with a particularly bright Masters student in my lab, Naveed Jooma (the third co-founder of Disco—sensing a theme?).  After we started talking about Disco, he quickly switched his Masters research from arcane (but very cool) studies of blood flow in the brain to studies of what we now call Discometry—the systematic study of Discovery.  His thesis research—and everything he has done since, which is close to everything in making Disco a functioning platform—was a huge step forward.  He’s actually the one who coined the ‘DiscoveryEngine’ name.

OK, back to changing the world.  Another key input at the time was one that has become disturbingly omnipresent: the growing buzz of disillusionment among more advanced students.  There are a lot of factors turning up the volume on this buzz—many analyses have suggested that the shortage of tenure-track jobs is a major soul-crusher. Another factor is that Science is Hard: great ideas typically take years to test, and can fail because, well, the truth doesn’t choose to conform.

But…among the better students, these factors were present but seldom what seemed the most upsetting. They knew the road would be hard. What seemed to be the most important factor in deflating their science enthusiasm was loss of faith in the institution. Politics around publishing in the single-name journals, grant obsession ad nauseam, and worrying about things like whether one got cited in a given paper seemed to them to be the only things PIs worried about. When their PIs were so compulsively worried about these factors—because all the rewards in the business seemed to depend on them so much—there didn’t seem to be much room for sitting under the apple tree and grokking elliptical motion. None of these concerns are remotely related to why students are willing to sacrifice to be in science.

Particularly troubling (to them, and to me) was the view they clearly expressed that science was just all about politics among a small group of elites, those that were gate-keepers of publication and grants. Spending time to break into that group felt like a long, unpleasant proposition, and then having so much decided based on the opinions of just a few people seemed backwards.

As all of these ideas about Discovery were gelling, the most important emulsifier was this disillusionment among students.  If we couldn’t keep the best students in the cause, the whole thing seemed like it would fall apart.

But now we had a definition for Discovery, meaning we could measure it.  And if we can measure it, we can reward it. We could have meta-journals reporting the best Discovery papers of the month; we could even give out awards to individuals—not the ‘wait a lifetime and maybe get called to Stockholm’ kind, but far more frequent, real-time, immediate, and distributed. The idea that working on Discovery could help re-orient the rewards in science, aligning them with the values of those doing the hardest and (again, IMO) most important work, felt like something really worth our time.

These ideas got mixed vigorously with another topic I was geeking out about around then: the wisdom of crowds (I swear I have hobbies besides reading the history of science; swearing is one of them, actually, as John will attest while slowly shaking his head…).  Without going into it, there’s good reason to believe that if a lot of independent individuals with even just a little background in a topic give their perspectives on Discovery, this user-respecting data might provide insight into emerging new directions in science—as we like to put it at the ‘Engine, we might be able to ‘sense pressure on the wall of the paradigm.’  Predicting the future, on top of changing science for the better, would, we reasoned, also be a good outcome…(I’d put an emoticon here if I used them).
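The wisdom-of-crowds intuition is easy to see in a toy simulation: the average of many independent, individually noisy ratings lands far closer to the underlying value than a typical single rater does. Everything below (the ‘true’ value, the noise level) is invented purely for illustration:

```python
# Toy wisdom-of-crowds sketch: averaging many independent noisy ratings.
import random
import statistics

random.seed(42)
true_value = 0.7   # hypothetical "true" Discovery Value of a paper
rater_noise = 0.5  # each rater's error (standard deviation)

# 200 independent raters, each seeing the true value through their own noise.
ratings = [true_value + random.gauss(0, rater_noise) for _ in range(200)]
crowd_estimate = statistics.mean(ratings)

print(round(crowd_estimate, 3))  # close to true_value, unlike most single ratings
```

The spread of individual ratings stays around the rater noise level, while the crowd mean’s error shrinks roughly with the square root of the number of raters, which is why many modestly informed, independent raters can be so useful.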

So, we set out to figure out how best to make these ideas serve science.   We engaged in a lengthy period of development on the way to forming the DiscoveryEngine that is now in ‘public beta’.  One important step forward was digging deeper into the history of similar ideas: a key highlight was when Jamie Pospishil found a fantastic set of papers showing that the framework we were creating around Discovery Value, and hoping to use to determine the patterns of innovation in science, was also used in an almost identical form in computer science/A.I., in that case for determining the value of databases.  That question about data carries the same user-oriented requirement we have at the ‘Engine (the conviction that something only has ‘Interestingness’ if you consider the interpreter of the information).

As part of this development, we’ve also talked with a lot of people we hope the ‘Engine will help, trying to learn how this new metric and our approach might best be constructed to help them. We have likely gathered feedback by now from at least a few hundred postdocs and graduate students across a variety of contexts. Our most sustained conversations have been with students and more senior scientists at the NIH, where we have been generously allowed to run our initial ‘private’ trial (much thanks to Anto Bonci!).  We’ve also taken a deep dive into talking with publishers, as we think this information about articles could be a big win for them as they think about impact (e.g., the Science editorial board, editors at NPG, the New York Times Science section, MIT Press).  We’ve also had great talks with a co-founder of Zotero, and with many people in leadership at professional societies (not surprisingly at the Society for Neuroscience, perhaps more surprisingly at the American Chemical Society). A real high point in this trajectory has been our recent conversations and emerging cooperative arrangements with Altmetric, who we view as a fantastic complementary enterprise.

Maybe our most important sustained conversations have been with our remarkable Board of Advisors: it’s humbling how much energy and time this esteemed group has put into helping think through our effort, really making it their own in a variety of ways.  This all sounds mushy, but who cares—when people show up like they have, it’s important to appreciate it.

All right, this is likely three times as long as a blog post is supposed to be (is there a metric for that?).  Hopefully it gives at least a rambling sense of how we got here, and hopefully the rest of the website gives a sense of where we are going.