Repo Lookout is a large-scale security scanner with a single purpose: finding source code repositories that have been accidentally exposed to the public and reporting them to the domain’s technical contact.
Accidentally exposed source code repositories often contain highly sensitive information that can be used for downstream attacks, such as data leaks and extortion via ransomware. While the problem has been known and documented for years[1], our findings show its continued prevalence.
Our goal is to combat this vulnerability by automatically detecting and reporting instances.
The URL index for the scanning process is obtained from the following two sources:
- CommonCrawl builds and maintains an open repository of web crawl data.
- Tranco List is a research-oriented top site ranking hardened against manipulation.
Our last security scan, on March 24th, 2022, checked 890,605,654 URLs on 63,180,936 domains.
A total of 54,844 publicly exposed source code repositories have been found so far.
How to Opt Out?
We designed the security scanner to be a good network citizen, i.e., requests are throttled and bandwidth usage is minimal. However, we do understand that not every website might welcome the scanning process[2].
There are two ways to opt out of the scanning process:
- Send us an opt-out email with the domain name, IP, IP range(s), or ASN. In the case of IP range(s) or an ASN, we request log entries caused by a previous scan as a means of authentication.
- Reject all requests with an HTTP User-Agent prefix of “
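For the User-Agent option, the matching logic could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the prefix `ExampleScanner` is a hypothetical placeholder and not the scanner's actual User-Agent, the helper names `should_reject` and `status_for` are made up for this example, and the case-insensitive comparison and the 403 status are design choices of the sketch rather than requirements.

```python
# Hypothetical placeholder: substitute the exact prefix published by the scanner.
BLOCKED_UA_PREFIX = "ExampleScanner"


def should_reject(user_agent, prefix=BLOCKED_UA_PREFIX):
    """Return True when the User-Agent value starts with the blocked prefix."""
    if not user_agent:
        # A missing or empty header is not treated as the scanner.
        return False
    # Assumption of this sketch: compare case-insensitively to be forgiving.
    return user_agent.strip().lower().startswith(prefix.lower())


def status_for(headers, prefix=BLOCKED_UA_PREFIX):
    """Map request headers to an HTTP status: 403 for the scanner, 200 otherwise."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return 403 if should_reject(normalized.get("user-agent"), prefix) else 200
```

In practice the same prefix check would usually live in the web server or reverse proxy configuration, so that the request is rejected before it reaches the application.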
To support this project, consider becoming a sponsor on Patreon. Thanks!