NOAH
JACOBS
HACKING
"Wasting programmer time is the true inefficiency, not wasting machine time."
- Paul Graham
I am a programmer without formal education. I’ve been lucky to work with and learn from people more experienced than myself; I’ve also been lucky that I’m a relatively quick learner.
Age 21 - Took EECS 183, the intro coding class at the University of Michigan | Managed the creation of some programmatic tools for investing; I wasn’t writing the code myself, but being forced to use the terminal was a gateway drug | Wrapped a Google Sheet with a script to alert me when it was someone’s birthday | Scraped a few sites, like FINRA
Age 22 - More scraping | Used genetic algorithms to optimize a trading strategy within a Monte Carlo simulation I built | Wrote scripts to monitor, process, and summarize 8-Ks & earnings calls for publicly traded companies | Contributed to and used some prompt benchmarking suites for A/B testing of content generation
Age 23 - Lots of API wrappers for a sales intelligence platform | Selenium test suite for the same | This site
I haven’t added a whole lot of content to the left column yet, but stay tuned for more.
Last Updated 2024.08.25
GRAPH OF MY BLOG POSTS
For a detailed explanation of what's going on, scroll below the graph...
Explanation
One of the reasons I write my blog is to connect seemingly disparate ideas in a fashion that encourages exponential learning in both myself and the reader. I think a natural way to visualize this is with a knowledge graph, seen above.
The graph is a visualization of each of my blog posts (lil blue dots) and how they relate to "crafts" (bigger blue dots) and "abstractions" (green dots).
The connections were derived by a mix of exact keyword matches as well as by asking an LLM whether it thought each post was related to each craft and abstraction.
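As a rough sketch of the keyword half of that process, the matching boils down to something like the Clojure below. The post and craft shapes and the helper name are illustrative, not my actual site code:

```clojure
(require '[clojure.string :as str])

;; Hypothetical shapes: a post is {:title ... :body ...} and a craft is
;; {:name ... :keywords #{...}}. An edge is drawn whenever any keyword
;; appears verbatim in the post's text (the LLM pass is separate).
(defn keyword-edges [posts crafts]
  (for [post  posts
        craft crafts
        :let  [text (str/lower-case (str (:title post) " " (:body post)))]
        :when (some #(str/includes? text %) (:keywords craft))]
    [(:title post) (:name craft)]))
```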
After clicking on a node, you can see a summary of the node as well as connected nodes that you can traverse to. Additionally, there is a search bar.
I would classify the search functionality as 'experimental'; when you run a search, the following happens (sketches of each step follow the list):
-> The query is tokenized using a custom WordPiece tokenizer in Clojure, a natural fit given the language's innately recursive character and the fact that it's what my site is written in.
-> The tokens are embedded in your browser using MiniLM via onnxruntime-web. Wild, right?
-> The titles of adjacent nodes are fetched, tokenized, and embedded in your browser.
-> Cosine similarity is run between the query embedding and each of the node embeddings.
-> Given the most similar node, the content of that node and your query are passed into GPT-4o mini to answer your question.
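To make those steps concrete, here are some hedged sketches. First, tokenization: greedy, longest-match-first WordPiece over a single word. This is the textbook algorithm rather than my exact implementation; `vocab` is assumed to be a set of subword strings, with "##" marking continuation pieces:

```clojure
(defn wordpiece-tokenize
  "Greedily split `word` into the longest subwords found in `vocab`."
  [vocab word]
  (loop [remaining word, first? true, tokens []]
    (if (empty? remaining)
      tokens
      (let [prefix (if first? "" "##")
            ;; try the longest matching subword first
            match  (->> (range (count remaining) 0 -1)
                        (map #(str prefix (subs remaining 0 %)))
                        (filter vocab)
                        first)]
        (if match
          (recur (subs remaining (- (count match) (count prefix)))
                 false
                 (conj tokens match))
          ;; no subword matches: the whole word maps to the unknown token
          ["[UNK]"])))))

;; e.g. (wordpiece-tokenize #{"un" "##affable"} "unaffable")
;; => ["un" "##affable"]
```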
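Next, the in-browser embedding. This is a ClojureScript sketch against onnxruntime-web; the namespace, model path, and tensor names (input_ids, attention_mask, last_hidden_state) are assumptions that vary by ONNX export, and since a single query needs no padding, plain mean pooling over the token vectors stands in for mask-weighted pooling:

```clojure
(ns graph.embed
  (:require ["onnxruntime-web" :as ort]))

;; model path is an assumption; any MiniLM ONNX export should do
(defonce session-promise
  (.create ort/InferenceSession "/models/minilm.onnx"))

(defn mean-pool
  "Average a flat [1 x seq-len x dim] Float32Array into one dim-length vector."
  [data seq-len dim]
  (vec (for [d (range dim)]
         (/ (reduce + (for [t (range seq-len)]
                        (aget data (+ (* t dim) d))))
            seq-len))))

(defn embed
  "Returns a js/Promise of a 384-dim MiniLM sentence embedding for `token-ids`."
  [session token-ids]
  (let [n     (count token-ids)
        ->i64 (fn [xs] (js/BigInt64Array.from (clj->js (map js/BigInt xs))))
        ids   (ort/Tensor. "int64" (->i64 token-ids) #js [1 n])
        mask  (ort/Tensor. "int64" (->i64 (repeat n 1)) #js [1 n])]
    (-> (.run session #js {:input_ids ids :attention_mask mask})
        (.then #(mean-pool (.-data (.-last_hidden_state %)) n 384)))))
```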
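The cosine similarity step is the simplest of the bunch (the :embedding key on nodes is illustrative):

```clojure
(defn dot [a b] (reduce + (map * a b)))

(defn cosine-similarity
  "Cosine of the angle between two equal-length vectors."
  [a b]
  (/ (dot a b)
     (* (Math/sqrt (dot a a)) (Math/sqrt (dot b b)))))

;; pick the neighbor whose title embedding is most similar to the query
(defn most-similar [query-vec nodes]
  (apply max-key #(cosine-similarity query-vec (:embedding %)) nodes))
```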
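Finally, the answer step is an ordinary chat-completions request with the winning node's content placed in the prompt. This is a ClojureScript sketch against OpenAI's HTTP API, not my exact wiring; in practice you'd route it through a server-side proxy rather than ship an API key to the browser:

```clojure
(defn answer-with-context
  "Ask gpt-4o-mini `question`, grounded in `node-content`.
   Returns a js/Promise of the answer string."
  [api-key node-content question]
  (-> (js/fetch "https://api.openai.com/v1/chat/completions"
        (clj->js {:method "POST"
                  :headers {"Content-Type"  "application/json"
                            "Authorization" (str "Bearer " api-key)}
                  :body (js/JSON.stringify
                          (clj->js {:model "gpt-4o-mini"
                                    :messages [{:role "system"
                                                :content (str "Answer using this blog post:\n"
                                                              node-content)}
                                               {:role "user"
                                                :content question}]}))}))
      (.then #(.json %))
      ;; pull the text out of the first completion choice
      (.then #(-> % .-choices (aget 0) .-message .-content))))
```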
If you can't tell, I'm fascinated by the capacity to run a model in your browser. I could pre-embed the titles, but I want to show that it really doesn't take that long in the browser.
As a note, when I was experimenting with the GPT wrapper, before I added my blog posts as context, I found that GPT has some latent knowledge about my blog, which was quite nice to see. That being said, this may confound some of the apparent info alpha from selecting the context the way I have. I think it's fair, though, to have the same model that made some of the connections evaluate them.
Below is a list of some of the things that made it into the current iteration, followed by a list of dead ends:
Current Iteration