Repo Lookout is a non-commercial project. We just want to make the web a safer place. You can support us on Ko-fi.

Find publicly exposed source code repositories

Repo Lookout is a large-scale security scanner with a single purpose: finding source code repositories that have been accidentally exposed to the public and reporting them to the domain’s technical contact.

Accidentally exposed source code repositories often contain highly sensitive information that can be used for downstream attacks, such as data leaks and extortion via ransomware. While the problem has been known and documented extensively for years [1][2][3][4], our findings show its continued prevalence.

Our goal is to combat this vulnerability by automatically detecting and reporting instances.

URL index

The URL index for the scanning process is obtained from the following two sources:

  • CommonCrawl builds and maintains an open repository of web crawl data.
  • Tranco List is a research-oriented top site ranking hardened against manipulation.

Statistics

Including our last security scan on December 9, 2022, we have checked 2,606,369,488 URLs on 338,444,533 domains.

A total of 459,632 publicly exposed source code repositories have been found so far.

Frequently Asked Questions

How to prevent exposed repositories?

It is recommended to configure the web server so that it denies access to all “dot folders” (i.e. folders starting with a “.”) [5]. However, to prevent exposing Git repositories, it is enough to deny access to “.git” folders.

How this is done exactly depends on the server software used, but here are some configuration examples for nginx, Apache, and Caddy.
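
As a server-independent illustration of the rule itself, the following Python sketch decides whether a request path should be denied. It is not a substitute for proper web server configuration. The function name `is_denied` and the decision to block dot files as well as dot folders are assumptions made for this example, as is the choice to honor the “.well-known” exception only at the top level of the site.

```python
from urllib.parse import unquote, urlsplit

# Assumption: ".well-known" (RFC 8615) is the only dot folder left reachable,
# and only as the first path segment.
ALLOWED_TOP_LEVEL = ".well-known"


def is_denied(url_path: str) -> bool:
    """Return True if the path touches a dot folder (or dot file) that should be blocked."""
    # Drop any query string and decode percent-escapes such as "%2e" before checking.
    path = unquote(urlsplit(url_path).path)
    segments = [segment for segment in path.split("/") if segment]
    for index, segment in enumerate(segments):
        if not segment.startswith("."):
            continue
        if index == 0 and segment == ALLOWED_TOP_LEVEL:
            continue
        return True
    return False
```

For example, `is_denied("/.git/HEAD")` is true, while `is_denied("/.well-known/security.txt")` is false.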

How to opt out of being scanned?

The security scanner was designed to be a good network citizen, i.e. requests are throttled and bandwidth usage is minimal. However, we do understand that not every website might welcome the scanning process [6].

There are two ways to opt out of the scanning process:

  1. Send us an opt-out email with the domain name, IP, IP range(s), or ASN. In the case of IP range(s) or an ASN, we request log entries caused by a previous scan as a means of authentication.

  2. Reject all requests with an HTTP User-Agent prefix of “RepoLookoutBot”.
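
The second option can be sketched as a small piece of WSGI middleware in Python. This is only an illustration under stated assumptions: the names `is_scanner` and `RejectScanner` are hypothetical, and answering with “403 Forbidden” is one possible way to reject a request. Only the “RepoLookoutBot” prefix comes from the text above. In practice the same check is usually done in the web server or firewall configuration.

```python
BLOCKED_PREFIX = "RepoLookoutBot"  # prefix named in the opt-out instructions


def is_scanner(user_agent: str) -> bool:
    """Return True if the User-Agent header starts with the scanner's prefix."""
    return user_agent.startswith(BLOCKED_PREFIX)


class RejectScanner:
    """WSGI middleware that answers 403 to the scanner and passes all other requests through."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the User-Agent request header as HTTP_USER_AGENT.
        if is_scanner(environ.get("HTTP_USER_AGENT", "")):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",  # assumption: any rejecting status would do
                [
                    ("Content-Type", "text/plain; charset=utf-8"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        return self.app(environ, start_response)
```

Wrapping an existing WSGI application is then a one-liner, e.g. `application = RejectScanner(application)`.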

Sponsoring

To support this project, consider becoming a sponsor on Ko-fi. All tips will be used for the crawling and email infrastructure.

Thanks!


  1. How unprotected .git repositories compromise website security (German, 2015)

  2. Source code disclosure via exposed .git folder (English, 2018)

  3. Open .git global scan (English, 2018)

  4. Finding exposed .git repositories (English, 2020)

  5. Except for the “.well-known” folder, which is defined in RFC 8615

  6. E.g. Fail2Ban is sometimes configured to trigger alerts on scanning .git folders