TIL: GitHub Code Search is Underrated

Posted on Nov 5, 2023

I find myself recommending GitHub code search a lot in conversations (most recently on Reddit), so here is a short write-up:

GitHub code search lets you query all public GitHub repos with a powerful search syntax. You can type a query straight into the search bar on the GitHub home page, or use the advanced search form.

I find code search most useful when I need to find out how to use a specific API from a library I’m not familiar with. Maybe the API isn’t well documented and a Google search doesn’t surface tutorials or usage examples. Another use case is finding out about conventional usage or best practices. Two examples:

Ex1: Finding out coding conventions

Let’s say I’m unsure how I should import the NumPy library in my Python code (a bit silly, but the simplest example I could come up with 🙈). I’d search:

"import numpy as" path:*.py NOT is:fork

Check out the results here!

The quotes make the query match the whole phrase rather than import, numpy, and as separately.
The path:*.py qualifier narrows the search to Python files, and NOT is:fork excludes forks to reduce noise. Scanning the first results page, I see that everyone imports NumPy as np. I might also glance at the repository names to gauge the credibility and quality of the code.
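
To make the anatomy of that query explicit, here is a minimal Python sketch that assembles it from its three parts. The helper name build_query and its parameters are made up for illustration; only the resulting query string comes from the example above.

```python
def build_query(phrase, path_glob=None, exclude_forks=True):
    """Assemble a code search query from its parts (hypothetical helper)."""
    # Quotes ask for the whole phrase instead of the individual words.
    parts = [f'"{phrase}"']
    if path_glob:
        # path: restricts matches to files whose path fits the glob.
        parts.append(f"path:{path_glob}")
    if exclude_forks:
        # Forks mostly duplicate their parent repo, so they add noise.
        parts.append("NOT is:fork")
    return " ".join(parts)


print(build_query("import numpy as", path_glob="*.py"))
# prints: "import numpy as" path:*.py NOT is:fork
```

Swapping the phrase or the glob (say, "*.rs" for Rust files) gives the same kind of query for another language.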

Ex2: Learning how an API is used

I recently wanted to find out how to use the value_parser functionality of the Rust Clap library to plug in my own custom parser. I used the following query:

clap AND value_parser= path:*.rs NOT is:fork

Check out the results here!

This search was just a starting point: from there I read through the source code of the returned results. The API is used in many different ways and I found very elegant code snippets that translated perfectly to my problem.
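
If you want to share a query like the two above as a link, a small Python sketch along these lines should do. It assumes that the search page accepts the query in a q parameter together with type=code, which is what the URL looks like when I search from the site; the function name code_search_url is my own.

```python
from urllib.parse import urlencode


def code_search_url(query):
    """Turn a code search query into a shareable link (hypothetical helper).

    Assumption: GitHub's search page takes the query in `q` and shows
    code results when `type=code` is set.
    """
    # urlencode percent-encodes quotes, colons, etc. and turns spaces into "+".
    return "https://github.com/search?" + urlencode({"q": query, "type": "code"})


queries = [
    '"import numpy as" path:*.py NOT is:fork',
    "clap AND value_parser= path:*.rs NOT is:fork",
]
for q in queries:
    print(code_search_url(q))
```

Letting urlencode do the escaping avoids hand-encoding the quotes and the = sign in the queries.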

Why Code Search When You Have Copilot?

The most obvious reason is that I work for many different customers and most of them don’t allow the use of Copilot. Also, I often find Copilot’s suggestions a poor fit for what I want to achieve in my code. In theory, Copilot suggests code that fits the context of my code, but more often than not it suggests some kind of “average usage” or “how most people use it”. This can be useful, e.g. in Example 1 above. However, looking at the usage of a library’s API in the context of other people’s code is often more useful for me. It helps me to “grok” the library, to get inspired by others’ code, and to memorize the library’s API.