GPT-3 for Science, Excel Domination, LLMs as a new UI
Hey everyone! I’m back to publishing after taking a break from the newsletter. It’s been a busy past month with my startup, NLP Labs. We had to pivot, and in a future edition I’ll explain why, but I’m optimistic about the new direction. (Shameless plug: if you need lots of text data labeled or categorized fast, we might be able to help.)
But I’m excited to get back to writing frequently about the topics I care about.
Here are 3 of the most interesting things I came across this past week:
🧬 How to Build a GPT-3 for Science
Imagine if you could get simple but correct answers to scientific questions. Imagine if it were 100x easier to read scientific papers. This can become a reality, but in the article, Josh Nicholson points out that LLMs haven’t had a major impact in the world of science because research papers are gated by publishers.
Research papers contain everything from references to other papers to key takeaways from experiments and statistical relationships. Breaking papers down into these components could make it possible to train LLMs to answer questions such as “Tell me why this hypothesis is wrong?” and “What evidence is there to support idea X?”
Josh Nicholson’s company, Scite, is trying to make the above a reality and has collected over 1B citation statements from 30+ million full-text articles! I love it when passionate people go through all the schlep work to build incredibly useful datasets. Scite used the dataset to build a machine learning model that can go through each citation in a paper and determine whether the citing paper provides supporting or contrasting evidence for the cited paper’s findings. This is so neat because it’s a way to quickly assess the credibility of a finding in a research paper.
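To make the idea concrete, here is a toy sketch in Python of what "classify each citation statement" could look like. To be clear, this is not how Scite does it: their model is trained on a large dataset, while this just matches a handful of cue phrases I made up. The function names, the cue lists, and the "neither" fallback label are all my own assumptions for illustration.

```python
from collections import Counter

# Hypothetical cue phrases, picked for illustration only.
# A real system would learn these signals from labeled data.
SUPPORTING_CUES = ("consistent with", "in agreement with", "confirms", "supports")
CONTRASTING_CUES = ("inconsistent with", "in contrast to", "contrary to", "contradicts")


def classify_citation(statement: str) -> str:
    """Label one citation statement as 'supporting', 'contrasting', or 'neither'."""
    text = statement.lower()
    # Check contrasting cues first, because "inconsistent with"
    # also contains the supporting cue "consistent with".
    if any(cue in text for cue in CONTRASTING_CUES):
        return "contrasting"
    if any(cue in text for cue in SUPPORTING_CUES):
        return "supporting"
    return "neither"


def tally(statements: list[str]) -> Counter:
    """Count how many statements citing a paper fall under each label."""
    return Counter(classify_citation(s) for s in statements)


if __name__ == "__main__":
    # Made-up example statements, not real citations.
    examples = [
        "Our results are consistent with Smith et al. [12].",
        "These findings are inconsistent with earlier work [3].",
        "We used the dataset from [7].",
    ]
    print(tally(examples))
```

A tally like this is what would let you eyeball the credibility of a finding: lots of supporting statements and few contrasting ones is a good sign, and the reverse is a red flag.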
Excel Never Dies
Packy McCormick describes Excel as “Lindy Software” because it’s been around for a long time and is very likely to continue to be around for a long time. Some reasons for this are:
Flexibility - Excel can handle a combinatorially explosive number of use cases without 1) knowing what people want to build on top of it beforehand and 2) being user-friendly
Network Effects - Users build functionality and extensions on top of it, which attracts and retains more users. Excel can handle more features without ruining its core experience
Passionate Users - Ever met an investment banker and not had the words “Excel” and “keyboard shortcuts” brought up?
You can even make the argument that most B2B SaaS companies, especially industry-specific CRMs (e.g. Salesforce), exist because they overcome one specific shortcoming of doing the workflow in Excel.
The Unbundling of Excel.
However, if LLMs drastically increase the usability of Excel and help overcome its shortcomings, I do wonder if that will only strengthen Excel’s moat and cause more users to opt out of vertical software.
LLMs as a new UI
We’re all going to be automated out of our jobs! Just kidding. But seriously, this demo by Adept AI is incredible. They trained a model to perform software tasks, and the thread I linked below goes through some compelling use cases. Just above in the newsletter, I mentioned that LLMs can augment the experience of spreadsheets.
I’m very curious to know if there’s been a paper published and what the training dataset looks like for this task. My initial guess is that the model breaks down each input command and extracts the intent depending on the software it’s running against.