Module materialize.cli.mock_telemetry_server

A mock telemetry server.

This is a mock of the telemetry server that runs at telemetry.materialize.com. It is intended for manual testing of materialized's telemetry reporting. (Automating this test would be nice, but it is not trivial.)

The following is an example of how you might use this mock telemetry server. First, launch the server:

$ bin/pyactivate -m materialize.cli.mock_telemetry_server

This generates a self-signed SSL certificate named "localhost.crt" in the current directory.

In another terminal, launch Materialize:

$ SSL_CERT_FILE=localhost.crt cargo run -- --dev --telemetry-domain localhost:4000 --telemetry-interval 5s

Notice how we configure Materialize to target the mock telemetry server. The SSL_CERT_FILE environment variable instructs OpenSSL to trust the self-signed certificate the server created. Using a small telemetry reporting interval means you won't be waiting hours to observe multiple turns of the reporting loop.

As the mock telemetry server receives requests, it logs them. You can add sources and sinks to Materialize and verify that the data reported to the mock telemetry server evolves accordingly.

By default, the mock telemetry server claims the latest version is the version specified in the Cargo.toml for the materialized crate. You can update its latest version on the fly with a PUT request:

$ SSL_CERT_FILE=localhost.crt curl https://localhost:4000 -X PUT --data 9.9.9

Increasing the latest version like so should cause Materialize to log a "new version" notice. You may need to check the log file to see the notice.
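
The same update can be issued from Python instead of curl. The following is a minimal sketch, not part of the module: the helper names build_version_update and send_version_update are hypothetical, and it assumes the server is listening on localhost:4000 with "localhost.crt" in the current directory. Whether Python's default verification accepts this particular self-signed certificate may depend on the Python and OpenSSL versions in use.

```python
import ssl
import urllib.request

# Address used in the walkthrough above.
MOCK_SERVER_URL = "https://localhost:4000"


def build_version_update(
    version: str, url: str = MOCK_SERVER_URL
) -> urllib.request.Request:
    """Build the PUT request whose body is the new latest version."""
    return urllib.request.Request(url, data=version.encode("UTF-8"), method="PUT")


def send_version_update(version: str, cafile: str = "localhost.crt") -> int:
    """Send the update, trusting the certificate stored in `cafile`.

    Hypothetical helper: it needs a running mock telemetry server.
    """
    context = ssl.create_default_context(cafile=cafile)
    request = build_version_update(version)
    with urllib.request.urlopen(request, context=context) as response:
        return response.status
```

Calling send_version_update("9.9.9") while the server is running plays the role of the curl command above.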

#!/usr/bin/env python3

# Copyright Materialize, Inc. and contributors. All rights reserved.
#
# Use of this software is governed by the Business Source License
# included in the LICENSE file at the root of this repository.
#
# As of the Change Date specified in that file, in accordance with
# the Business Source License, use of this software will be governed
# by the Apache License, Version 2.0.


"""A mock telemetry server.

This is a mock of the telemetry server that runs at telemetry.materialize.com.
It is intended for manual testing of materialized's telemetry reporting.
(Automating this test would be nice, but it is not trivial.)

The following is an example of how you might use this mock telemetry server.
First, launch the server:

    $ bin/pyactivate -m materialize.cli.mock_telemetry_server

This generates a self-signed SSL certificate named "localhost.crt" in the
current directory.

In another terminal, launch Materialize:

    $ SSL_CERT_FILE=localhost.crt cargo run -- --dev --telemetry-domain localhost:4000 --telemetry-interval 5s

Notice how we configure Materialize to target the mock telemetry server. The
SSL_CERT_FILE environment variable instructs OpenSSL to trust the self-signed
certificate the server created. Using a small telemetry reporting interval means
you won't be waiting hours to observe multiple turns of the reporting loop.

As the mock telemetry server receives requests, it logs them. You can add
sources and sinks to Materialize and verify that the data reported to the mock
telemetry server evolves accordingly.

By default, the mock telemetry server claims the latest version is the version
specified in the Cargo.toml for the materialized crate. You can update its
latest version on the fly with a PUT request:

    $ SSL_CERT_FILE=localhost.crt curl https://localhost:4000 -X PUT --data 9.9.9

Increasing the latest version like so should cause Materialize to log a "new
version" notice. You may need to check the log file to see the notice.
"""

import json
import os
import os.path
import ssl
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

from materialize.cargo import Workspace


class Handler(BaseHTTPRequestHandler):
    def read_body(self) -> str:
        content_length = int(self.headers["content-length"])
        return self.rfile.read(content_length).decode("UTF-8")

    def do_PUT(self) -> None:
        global latest_release
        latest_release = self.read_body()
        self.send_response(200)
        self.end_headers()

    def do_POST(self) -> None:
        print(json.loads(self.read_body()))
        return self.do_GET()

    def do_GET(self) -> None:
        self.send_response(200)
        self.end_headers()
        body = json.dumps({"latest_release": latest_release})
        self.wfile.write(body.encode("UTF-8"))


if __name__ == "__main__":
    if os.path.exists("localhost.crt"):
        print("localhost.crt already exists, not regenerating")
    else:
        print("Generating self-signed cert for localhost...")
        os.system(
            "openssl req -nodes -x509 -keyout localhost.crt -out localhost.crt -subj '/CN=localhost'"
        )

    cargo_workspace = Workspace(Path(os.environ["MZ_ROOT"]))
    latest_release = str(cargo_workspace.crate_for_bin("materialized").version)

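    httpd = HTTPServer(("localhost", 4000), Handler)
    # ssl.wrap_socket was deprecated in Python 3.7 and removed in 3.12, so
    # wrap the listening socket with an SSLContext instead. localhost.crt
    # holds both the certificate and its private key.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile="localhost.crt")
    httpd.socket = context.wrap_socket(httpd.socket, server_side=True)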

    print("Listening on localhost:4000...")
    httpd.serve_forever()
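
The module docstring notes that automating this test is not trivial. As a partial step, the handler logic alone can be exercised in-process. The sketch below is an assumption-laden illustration rather than the project's test: MockTelemetryHandler is a hypothetical copy of Handler that keeps the version in a class attribute instead of a global, the server speaks plain HTTP on an ephemeral port (no TLS), and the POST payload is made up.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List, Optional


class MockTelemetryHandler(BaseHTTPRequestHandler):
    # Hypothetical stand-in for Handler; the version is a class attribute.
    latest_release = "0.0.0"

    def read_body(self) -> str:
        content_length = int(self.headers["content-length"])
        return self.rfile.read(content_length).decode("UTF-8")

    def do_PUT(self) -> None:
        type(self).latest_release = self.read_body()
        self.send_response(200)
        self.end_headers()

    def do_POST(self) -> None:
        json.loads(self.read_body())  # the real handler prints the report
        self.do_GET()

    def do_GET(self) -> None:
        self.send_response(200)
        self.end_headers()
        body = json.dumps({"latest_release": type(self).latest_release})
        self.wfile.write(body.encode("UTF-8"))

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep the example quiet


def exchange(port: int, method: str, body: Optional[bytes] = None) -> str:
    """Send one request to the local server and return the response body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/", body=body)
        return conn.getresponse().read().decode("UTF-8")
    finally:
        conn.close()


def run_scenario() -> List[str]:
    """Return the reported latest release before and after a PUT."""
    server = HTTPServer(("127.0.0.1", 0), MockTelemetryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_port
        before = json.loads(exchange(port, "GET"))
        exchange(port, "PUT", b"9.9.9")
        # Made-up report payload; the real report format is not shown here.
        after = json.loads(exchange(port, "POST", b'{"sources": 1}'))
        return [before["latest_release"], after["latest_release"]]
    finally:
        server.shutdown()
        server.server_close()
```

This covers only the request handling; it does not check TLS or Materialize's side of the reporting loop.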

Classes

class Handler (request, client_address, server)

HTTP request handler base class.

The following explanation of HTTP serves to guide you through the code as well as to expose any misunderstandings I may have about HTTP (so you don't need to read the code to figure out I'm wrong :-).

HTTP (HyperText Transfer Protocol) is an extensible protocol on top of a reliable stream transport (e.g. TCP/IP). The protocol recognizes three parts to a request:

  1. One line identifying the request type and path
  2. An optional set of RFC-822-style headers
  3. An optional data part

The headers and data are separated by a blank line.
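
The three parts can be seen by splitting a raw request by hand. This is a simplified sketch for illustration only: split_request is a hypothetical helper, it assumes CRLF line endings, and it ignores repeated and folded headers, which http.server handles through its own parser.

```python
from typing import Dict, Tuple


def split_request(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split a raw request into request line, headers, and data."""
    # The first blank line separates the head from the optional data part.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    request_line, header_lines = lines[0], lines[1:]

    headers: Dict[str, str] = {}
    for line in header_lines:
        # Split on the first colon only; values such as host:port keep theirs.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body
```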

The first line of the request has the form

<command> <path> <version>

where <command> is a (case-sensitive) keyword such as GET or POST, <path> is a string containing path information for the request, and <version> should be the string "HTTP/1.0" or "HTTP/1.1". <path> is encoded using the URL encoding scheme (using %xx to signify the ASCII character with hex code xx).

The specification specifies that lines are separated by CRLF but for compatibility with the widest range of clients recommends servers also handle LF. Similarly, whitespace in the request line is treated sensibly (allowing multiple spaces between components and allowing trailing whitespace).

Similarly, for output, lines ought to be separated by CRLF pairs but most clients grok LF characters just fine.

If the first line of the request has the form

<command> <path>

(i.e. <version> is left out) then this is assumed to be an HTTP 0.9 request; this form has no optional headers and data part and the reply consists of just the data.
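
The request-line rules described above can be sketched as a small parser. This is an illustration, not the code http.server runs: parse_request_line and RequestLine are hypothetical names, and decoding the path with urllib.parse.unquote is a choice made here for demonstration.

```python
from typing import NamedTuple
from urllib.parse import unquote


class RequestLine(NamedTuple):
    command: str
    path: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Parse a request line, tolerating LF endings and extra whitespace."""
    # split() with no arguments collapses runs of spaces and drops
    # trailing whitespace, including the line terminator.
    words = line.split()
    if len(words) == 3:
        command, raw_path, version = words
    elif len(words) == 2:
        # The version is left out: treat this as an HTTP 0.9 request.
        command, raw_path = words
        version = "HTTP/0.9"
    else:
        raise ValueError(f"bad request line: {line!r}")
    # %xx escapes in the path stand for the character with hex code xx.
    return RequestLine(command, unquote(raw_path), version)
```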

The reply form of the HTTP 1.x protocol again has three parts:

  1. One line giving the response code
  2. An optional set of RFC-822-style headers
  3. The data

Again, the headers and data are separated by a blank line.

The response code line has the form

<version> <responsecode> <responsestring>

where <version> is the protocol version ("HTTP/1.0" or "HTTP/1.1"), <responsecode> is a 3-digit response code indicating success or failure of the request, and <responsestring> is an optional human-readable string explaining what the response code means.
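
As a small illustration of the response code line, the sketch below formats and parses one. The helper names format_status_line and parse_status_line are hypothetical; the reason phrases come from the standard library's http.HTTPStatus.

```python
from http import HTTPStatus
from typing import Tuple


def format_status_line(code: int, version: str = "HTTP/1.0") -> str:
    """Build a response code line using the standard reason phrase."""
    return f"{version} {code} {HTTPStatus(code).phrase}\r\n"


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split a response code line into version, code, and reason."""
    # Split at most twice so a multi-word reason string stays intact.
    parts = line.rstrip("\r\n").split(None, 2)
    version, code = parts[0], int(parts[1])
    # The human-readable string is optional.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason
```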

This server parses the request and the headers, and then calls a function specific to the request type (<command>). Specifically, a request SPAM will be handled by a method do_SPAM(). If no such method exists the server sends an error response to the client. If it exists, it is called with no arguments:

do_SPAM()

Note that the request name is case sensitive (i.e. SPAM and spam are different requests).
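
The dispatch rule can be sketched without any networking. TinyDispatcher is a hypothetical class used only to show the name-based lookup; returning a (status, text) pair is a simplification, with 501 standing in for the error response sent when no matching method exists.

```python
from typing import Tuple


class TinyDispatcher:
    """Look up a handler method named after the request command."""

    def handle(self, command: str) -> Tuple[int, str]:
        # The lookup is case sensitive: "SPAM" and "spam" are different.
        method = getattr(self, "do_" + command, None)
        if method is None:
            return 501, f"Unsupported method ({command!r})"
        # The handler method is called with no arguments.
        return 200, method()

    def do_SPAM(self) -> str:
        return "spam handled"
```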

The various request details are stored in instance variables:

  • client_address is the client IP address in the form (host, port);

  • command, path and version are the broken-down request line;

  • headers is an instance of email.message.Message (or a derived class) containing the header information;

  • rfile is a file object open for reading positioned at the start of the optional input data part;

  • wfile is a file object open for writing.
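
To see these instance variables in practice, the following sketch runs a handler on an ephemeral local port and reports what it observed. EchoHandler and describe_request are hypothetical names, and the server uses plain HTTP. Note that in http.server the version string from the request line is stored in the attribute request_version.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # rfile is positioned at the start of the data part; read exactly
        # as many bytes as the Content-Length header announces.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        summary = {
            "client_host": self.client_address[0],
            "command": self.command,
            "path": self.path,
            "version": self.request_version,
            "content_type": self.headers.get("Content-Type"),
            "body_length": len(body),
        }
        payload = json.dumps(summary).encode("UTF-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep the example quiet


def describe_request(path: str, body: bytes) -> dict:
    """POST `body` to a throwaway server and return what the handler saw."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request("POST", path, body=body, headers={"Content-Type": "text/plain"})
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```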

IT IS IMPORTANT TO ADHERE TO THE PROTOCOL FOR WRITING!

The first thing to be written must be the response line. Then follow 0 or more header lines, then a blank line, and then the actual data (if any). The meaning of the header lines depends on the command executed by the server; in most cases, when data is returned, there should be at least one header line of the form

Content-type: <type>/<subtype>

where <type> and <subtype> should be registered MIME types, e.g. "text/html" or "text/plain".
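
The required write order can be made concrete by assembling a response by hand. This is a sketch for illustration only: build_response is a hypothetical helper, and the Content-Length header is an addition made here that the text above does not require.

```python
from http import HTTPStatus


def build_response(code: int, content_type: str, body: bytes) -> bytes:
    """Assemble a raw HTTP/1.0 response in the required order."""
    # 1. The response line comes first.
    status_line = f"HTTP/1.0 {code} {HTTPStatus(code).phrase}\r\n"
    # 2. Zero or more header lines follow.
    headers = (
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
    )
    # 3. A blank line ends the headers, and then the data is written.
    return (status_line + headers + "\r\n").encode("iso-8859-1") + body
```

Inside a handler, send_response, send_header, end_headers, and a final write to wfile produce the same sequence.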

Ancestors

  • http.server.BaseHTTPRequestHandler
  • socketserver.StreamRequestHandler
  • socketserver.BaseRequestHandler

Methods

def do_GET(self) -> None

def do_GET(self) -> None:
    self.send_response(200)
    self.end_headers()
    body = json.dumps({"latest_release": latest_release})
    self.wfile.write(body.encode("UTF-8"))

def do_POST(self) -> None

def do_POST(self) -> None:
    print(json.loads(self.read_body()))
    return self.do_GET()

def do_PUT(self) -> None

def do_PUT(self) -> None:
    global latest_release
    latest_release = self.read_body()
    self.send_response(200)
    self.end_headers()

def read_body(self) -> str

def read_body(self) -> str:
    content_length = int(self.headers["content-length"])
    return self.rfile.read(content_length).decode("UTF-8")