bgpstuff.net has been far more successful than I could have imagined. I initially created it as a side project to learn how to build a website directly in Go. I also used gRPC between the front end and back end. Since I created the first implementation, it has served nearly 2 million requests.

Soon after the initial version, I added JSON support because I wanted to create a RESTful API for users to query. I don’t have much experience with REST, so my implementation was not great. Various cracks appeared in my code as it dealt with the sheer number of queries.

Because of this, in December 2020 I decided to rewrite bgpstuff.net from scratch. Nearly 100% of the code is brand new. The few items I reused were my .css file and the general layout of my templates, so that to most users the site looks exactly the same. Almost everything underneath has changed.

This post will detail some of the more interesting changes.

More resilient

To better understand the architecture, feel free to view my presentation at virtualnog.net. The website itself is merely a front-end to a gRPC glass server. It’s this server that does all the lookups and caching. On the old site, this front-end only spoke to a single backend. The issue here is that certain errors could cause that glass server to restart. In the few seconds that it took to restart, any query would fail. The new version now connects to multiple backends. Each query gets sent to all backends at the same time. The first non-error reply that comes back is returned to the user. This way, if a backend fails, you’ll still get a result, albeit with higher latency.

The following is a code snippet to show how this is done.

	ch := make(chan results, len(m.Stubs))
	for _, s := range m.Stubs {
		go func(stub gpb.LookingGlassClient) {
			res, err := stub.Route(ctx, &gpb.RouteRequest{
				IpAddress: &gpb.IpAddress{
					Address: ip,
				},
			})
			st, _ := status.FromError(err)
			if st.Code() == codes.OK {
				ch <- results{res, err}
			}
		}(s)
	}
	// Grab whichever result comes in first with an OK status.
	first := <-ch

m.Stubs is a slice of all the backend stubs. Each backend is queried in its own goroutine. The first reply that comes back from any backend with a status of OK is put onto the channel, and that reply is used to answer the user. The context is also canceled (not shown in the snippet) so that queries still running against the other backends know they can stop working.
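
The snippet above only covers the happy path. The following is a minimal, self-contained sketch of the same fan-out idea, not the site’s actual code. It assumes a hypothetical backend function type in place of the gRPC stubs, and the names firstOK and lookup are made up for illustration. Unlike the snippet, it also sends errors on the channel so the caller can give up once every backend has failed instead of waiting forever.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// backend is a hypothetical stand-in for a gRPC stub such as gpb.LookingGlassClient.
type backend func(ctx context.Context, ip string) (string, error)

type result struct {
	route string
	err   error
}

// firstOK queries every backend concurrently and returns the first successful
// reply. If every backend fails, it returns the last error received.
func firstOK(ctx context.Context, backends []backend, ip string) (string, error) {
	if len(backends) == 0 {
		return "", errors.New("no backends configured")
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // tells the slower backends to stop once we return

	// Buffered so that goroutines finishing late never block on send.
	ch := make(chan result, len(backends))
	for _, b := range backends {
		go func(b backend) {
			route, err := b(ctx, ip)
			ch <- result{route, err}
		}(b)
	}

	var lastErr error
	for range backends {
		select {
		case r := <-ch:
			if r.err == nil {
				return r.route, nil
			}
			lastErr = r.err
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
	return "", lastErr
}

// lookup wraps firstOK with an overall one-second deadline.
func lookup(ip string, backends ...backend) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return firstOK(ctx, backends, ip)
}

// down simulates a backend that is restarting.
func down(ctx context.Context, ip string) (string, error) {
	return "", errors.New("backend restarting")
}

// slow simulates a healthy backend with some latency.
func slow(ctx context.Context, ip string) (string, error) {
	select {
	case <-time.After(20 * time.Millisecond):
		return "8.8.8.0/24", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

func main() {
	route, err := lookup("8.8.8.8", down, slow)
	fmt.Println(route, err)
}
```

With one failing and one healthy backend, the caller still gets the healthy answer. With only failing backends, it gets an error back rather than hanging.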

Correct REST URIs

On the old site, a request for a specific prefix or ASN used a query string, for example https://bgpstuff.net/route?ip=8.8.8.8. The new site uses a proper REST path structure instead: https://bgpstuff.net/route/8.8.8.8. Eventually, I’ll also support subnet masks in queries the same way: https://bgpstuff.net/route/8.8.8.0/24
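
As a rough illustration of how a path like this could be parsed, here is a small sketch using Go’s net/netip package. The function name parseRouteTarget is hypothetical, and treating a bare address as a host prefix (/32 or /128) is my assumption, not necessarily how the site does it.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseRouteTarget extracts the lookup target from a path such as
// "/route/8.8.8.8" or "/route/8.8.8.0/24". A bare address is returned
// as a host prefix (/32 for IPv4, /128 for IPv6).
func parseRouteTarget(path string) (netip.Prefix, error) {
	const prefix = "/route/"
	if !strings.HasPrefix(path, prefix) || len(path) == len(prefix) {
		return netip.Prefix{}, fmt.Errorf("unexpected path %q", path)
	}
	rest := strings.TrimPrefix(path, prefix)

	// A slash in the remainder means the client supplied a subnet mask.
	if strings.Contains(rest, "/") {
		return netip.ParsePrefix(rest)
	}
	addr, err := netip.ParseAddr(rest)
	if err != nil {
		return netip.Prefix{}, err
	}
	return netip.PrefixFrom(addr, addr.BitLen()), nil
}

func main() {
	for _, p := range []string{"/route/8.8.8.8", "/route/8.8.8.0/24", "/route/nonsense"} {
		target, err := parseRouteTarget(p)
		fmt.Println(p, target, err)
	}
}
```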

Correct HTTP status codes

Pretty much any query to the old site returned an HTTP 200 OK, regardless of whether the query was invalid or the server itself hit an error. The new site sets status codes correctly, so clients have more information about what went wrong and what to do next.
Old:

$ curl -i https://bgpstuff.net/route?ip=10.0.0.0
HTTP/2 200

New:

$ curl -i https://bgpstuff.net/route/10.0.0.0
HTTP/2 400
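
To make the idea concrete, here is a hedged sketch of a handler that maps outcomes to status codes. It is not the site’s handler. The lookupFunc type and the helper names are hypothetical, and I’m assuming 10.0.0.0 is rejected because it is private address space that never appears in the global table.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/netip"
	"strings"
)

// lookupFunc is a hypothetical stand-in for the call to the backend.
type lookupFunc func(addr netip.Addr) (string, error)

// routeHandler answers 400 for bad input, 500 for backend failures,
// and 200 only when a route was actually found.
func routeHandler(lookup lookupFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		addr, err := netip.ParseAddr(strings.TrimPrefix(r.URL.Path, "/route/"))
		if err != nil || addr.IsPrivate() {
			// The client sent something we can never answer.
			http.Error(w, "invalid or non-public IP address", http.StatusBadRequest)
			return
		}
		route, err := lookup(addr)
		if err != nil {
			// The request was fine, but something broke on our side.
			http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
			return
		}
		fmt.Fprintln(w, route)
	}
}

// statusOf runs one request through the handler and reports the status code.
func statusOf(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func okLookup(netip.Addr) (string, error) { return "8.8.8.0/24", nil }

func brokenLookup(netip.Addr) (string, error) {
	return "", errors.New("backend unavailable")
}

func main() {
	fmt.Println(statusOf(routeHandler(okLookup), "/route/8.8.8.8"))
	fmt.Println(statusOf(routeHandler(okLookup), "/route/10.0.0.0"))
	fmt.Println(statusOf(routeHandler(brokenLookup), "/route/8.8.8.8"))
}
```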

JSON output

Previously you could get JSON output via a query parameter. This is no longer the case. To get a JSON-encoded response, you’ll need to send the request with a Content-Type: application/json request header. My eventual clients will do this, but it can also be done in curl like so:

$ curl -H "Content-Type: application/json" https://bgpstuff.net/route/8.8.8.8
{"CurrentYear":0,"Locale":"","IP":"","Timer":"9.864126ms","Response":{"Action":"route","Route":"8.8.8.0/24","ASPath":null,"ASSet":null,"Origin":"15169","ROA":"","ASName":"","ASLocale":"","Invalids":null,"Sourced":{"Ipv4":0,"Ipv6":0,"Prefixes":null},"Location":{"Lat":"","Long":"","City":"","Country":"","Map":""},"Totals":{"Ipv4":0,"Ipv6":0,"Time":0},"IP":"8.8.8.8","Exists":true}}

Any JSON query returns the render struct in JSON format. I’ll be documenting what each field means, as well as the clients I create, which will have the required getters in them.
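
Until those clients exist, a Go program could consume the output along these lines. This is only a sketch: the reply struct covers a handful of the fields visible in the sample output above, the names reply, newJSONRequest and decodeReply are hypothetical, and the example decodes a trimmed copy of that sample rather than calling the live site.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// reply models only some of the fields seen in the sample output.
// Fields not listed here are ignored by encoding/json.
type reply struct {
	Timer    string
	Response struct {
		Action string
		Route  string
		Origin string
		IP     string
		Exists bool
	}
}

// sample is a trimmed copy of the JSON output shown above.
const sample = `{"Timer":"9.864126ms","Response":{"Action":"route","Route":"8.8.8.0/24","Origin":"15169","IP":"8.8.8.8","Exists":true}}`

// newJSONRequest builds (but does not send) a GET request carrying the
// same header as the curl example.
func newJSONRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// decodeReply unmarshals a JSON response body into a reply.
func decodeReply(data []byte) (reply, error) {
	var r reply
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	req, err := newJSONRequest("https://bgpstuff.net/route/8.8.8.8")
	if err != nil {
		fmt.Println("building request:", err)
		return
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Content-Type"))

	r, err := decodeReply([]byte(sample))
	if err != nil {
		fmt.Println("decoding:", err)
		return
	}
	fmt.Println(r.Response.IP, "is covered by", r.Response.Route, "from AS"+r.Response.Origin)
}
```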

AS-SET now works

Previously, as-path queries would only output the AS path. Now the AS-SET is also appended when it exists.
Old:

$ curl https://bgpstuff.net/aspath?ip=2400:5200:402::
The AS path for 2400:5200:402:: is [37100 9498 55410 55410]

New:

$ curl https://bgpstuff.net/aspath/2400:5200:402::
The AS path for 2400:5200:402:: is 37100 9498 55410 55410 { 16625 65002 }
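
The new output format can be produced with a few lines of Go. This is a sketch of the formatting only, with hypothetical function names, and it assumes the path and set are already available as slices of AS numbers.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// joinASNs renders AS numbers separated by single spaces.
func joinASNs(asns []uint32) string {
	parts := make([]string, len(asns))
	for i, a := range asns {
		parts[i] = strconv.FormatUint(uint64(a), 10)
	}
	return strings.Join(parts, " ")
}

// formatASPath renders the AS path and, only when one exists,
// appends the AS-SET inside braces.
func formatASPath(path, set []uint32) string {
	out := joinASNs(path)
	if len(set) > 0 {
		out += " { " + joinASNs(set) + " }"
	}
	return out
}

func main() {
	path := []uint32{37100, 9498, 55410, 55410}
	set := []uint32{16625, 65002}
	fmt.Printf("The AS path for %s is %s\n", "2400:5200:402::", formatASPath(path, set))
}
```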

No more partially rendered pages

Previously, if an error was encountered during a page render, it failed silently: only part of the page was rendered, and the response still had an HTTP status of 200 OK. This is no longer the case. I now render the page into a buffer first. If an error occurs, the user gets an internal server error instead. If there is no issue, the buffer is written out as the response. This is what it looks like in code.

	buf := new(bytes.Buffer)

	err := ts.ExecuteTemplate(buf, name, app.addDefaultData(td, r))
	if err != nil {
		app.serverError(w, err)
		return
	}

	// If no errors, dump the buffer to the response writer.
	buf.WriteTo(w)
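
To see the effect in isolation, here is a self-contained sketch of the same pattern. It is not the site’s code: the render and renderOnce helpers and the fail template function are hypothetical, and fail exists only to force an error halfway through a template.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"html/template"
	"log"
	"net/http"
	"net/http/httptest"
)

// render executes the template into a buffer first, so a failure halfway
// through never reaches the client as a partial page with a 200 status.
func render(w http.ResponseWriter, ts *template.Template, name string, data any) {
	buf := new(bytes.Buffer)
	if err := ts.ExecuteTemplate(buf, name, data); err != nil {
		http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
		return
	}
	if _, err := buf.WriteTo(w); err != nil {
		log.Printf("writing response: %v", err)
	}
}

// renderOnce parses text as a template named "page", renders it through
// render, and returns the resulting status code and body.
func renderOnce(text string, data any) (int, string) {
	funcs := template.FuncMap{
		// fail always errors, simulating a problem in the middle of a render.
		"fail": func() (string, error) { return "", errors.New("boom") },
	}
	ts := template.Must(template.New("page").Funcs(funcs).Parse(text))
	rec := httptest.NewRecorder()
	render(rec, ts, "page", data)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := renderOnce("<h1>{{.}}</h1>", "ok")
	fmt.Println(code, body)

	// The heading is rendered before fail runs, but it never reaches the client.
	code, body = renderOnce("<h1>Route</h1>{{fail}}", nil)
	fmt.Printf("%d %q\n", code, body)
}
```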

Only TLS 1.2 and 1.3 support

While you won’t see this on the initial page as I’m fronted by Cloudflare, any request from Cloudflare to the site only supports TLS 1.2 and 1.3. Plain HTTP has never been supported. On the backend, I also only support the ciphers that have assembly implementations inside Go and nothing else.
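
In Go, this kind of restriction is expressed through crypto/tls. The sketch below shows the general shape. The cipher list is my assumption of what “assembly implementations” covers (the AES-GCM and ChaCha20-Poly1305 suites), not the site’s actual list, and newTLSConfig and the listen address are hypothetical. Note that Go only lets you choose cipher suites for TLS 1.2 and below; the TLS 1.3 suites are not configurable.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newTLSConfig refuses anything older than TLS 1.2 and limits TLS 1.2
// connections to AEAD cipher suites. TLS 1.3 is enabled by default and
// its suites cannot be configured in Go.
func newTLSConfig() *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		// Assumed list: only applies to TLS 1.2 connections.
		CipherSuites: []uint16{
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
			tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
		},
	}
}

func main() {
	srv := &http.Server{
		Addr:      ":8443", // hypothetical listen address
		TLSConfig: newTLSConfig(),
	}
	fmt.Printf("min TLS version: %#x, TLS 1.2 cipher suites: %d\n",
		srv.TLSConfig.MinVersion, len(srv.TLSConfig.CipherSuites))
	// Calling srv.ListenAndServeTLS with a certificate and key file would
	// start serving with these settings.
}
```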

Invalids

https://bgpstuff.net/invalids was a hidden option before, but will now be accessible from the front page. The generic link shows all ASNs and their invalids. You can also visit an ASN-specific link to see invalids sourced by just that ASN: https://bgpstuff.net/invalids/3356

Site ping

https://bgpstuff.net/ping can be used to see if the site is active at any time. It will simply respond with HTTP 200 OK if the site is up.

Numerous backend improvements

Too many to mention, but I’ll highlight a few:

  • Every request now gets a random string added to its context. This context is passed down the chain from handler to backend. If a query breaks, the error should report that value and I should be alerted to it. I can then use the value to work out what happened.
  • Handlers are now wrapped in middleware, which handles things like logging. Previously this was repeated code in each function.
  • Tests. Coverage is high but could be better. The main reason for the rewrite was that my previous functions were written in a way that was difficult to test. I’ve tried to write all functions in a testable way, with integration tests as well as unit tests.
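
The first two points fit together naturally. Here is a minimal sketch of middleware that tags each request with a random ID in its context and logs it. All names here (withRequestID, requestID, echoID, callOnce) are hypothetical, and the ID length and log format are arbitrary choices for the example.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// ctxKey is an unexported type so the context key cannot collide with
// keys set by other packages.
type ctxKey struct{}

// newRequestID returns 16 random hex characters.
func newRequestID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "unknown"
	}
	return hex.EncodeToString(b)
}

// requestID retrieves the ID stored by withRequestID, or "" if absent.
func requestID(ctx context.Context) string {
	id, _ := ctx.Value(ctxKey{}).(string)
	return id
}

// withRequestID wraps a handler: it logs the request once and attaches
// a random ID to the context that is passed down the chain.
func withRequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := newRequestID()
		log.Printf("id=%s %s %s", id, r.Method, r.URL.Path)
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// echoID is a toy handler that writes back the ID it finds in the context.
func echoID(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, requestID(r.Context()))
}

// callOnce sends one request through the wrapped handler and returns the body.
func callOnce(path string) string {
	rec := httptest.NewRecorder()
	h := withRequestID(http.HandlerFunc(echoID))
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println("request ID seen by handler:", callOnce("/route/8.8.8.8"))
}
```

Because the ID travels in the context, anything further down the chain that receives the context can include the same value in its errors.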

TODOs

  • Handle subnet masks in various queries
  • DNS that can do location-based forwarding, e.g. allowing European users to use a server closer to them. I need a free option to do this.
  • If I have the above, I would then like to add more clients, one on each continent. For that I need a VM with both IPv4 and IPv6 connectivity and about a gig of RAM.
  • Clients. I’ll be writing Python and Go client libraries that you can import and use to speak directly to the site via REST. Maybe some other languages eventually.