Standardized logging formats matter to teams that rely on logs for observability, troubleshooting, and workflow integration. Structured formats simplify parsing and remove the need to interpret fields manually, which keeps log data consistent across sources. This reduces manual work, avoids the brittleness of unstructured logs, and simplifies integration between teams that feed logs into a shared aggregation system.
They are especially valuable for organizations that aggregate logs from heterogeneous technologies and tools. HAProxy users can now address this need natively.
With the release of HAProxy 3.0, we introduced new JSON and CBOR encoding options. Users can now format log lines as JSON or CBOR, making them easier to parse. Furthermore, when formatting log lines as CBOR, users also have the option to specify binary (+bin) to reduce bandwidth.
Additionally, log format expressions were sometimes ambiguous about data type. Now, you can explicitly define data types for log format expression fields to improve consistency between logs.
These updates simplify log management and ensure better interoperability across systems. Furthermore, applications handling these structured logs benefit from enhanced performance, no longer needing to manually extract and format data from default logs.
Why did we add new encoding options to HAProxy?
HAProxy users have long wanted HAProxy logs to be in a standardized format to simplify the extraction of data, improve application performance, and ensure compatibility with existing logging facilities.
While default HAProxy logs are not standardized (often requiring custom parsers to extract relevant information), this design was intentional and suited earlier logging practices. When the format was designed, structured formats such as JSON were not yet common, and HAProxy logs were primarily read directly by administrators with minimal need for parsing. As a result, HAProxy logs were designed to be compact, and the format has been kept for backward compatibility.
However, logging needs have evolved. Users frequently manipulate the log format to make the output compliant with other systems. If users wanted HAProxy logs in a JSON or CBOR format, they had to convert them manually or use middleware to convert HAProxy logs to the desired format.
Furthermore, emitting structured logs with HAProxy was often difficult and problematic:
Quoting issues. The %{+Q} format often failed to quote string values consistently. For example, some string values (e.g., %tsc) were not quoted while others were. Also, numeric values leveraging sample expressions (e.g., %[expression]) were inconsistently quoted.
Hexadecimal representation. The %{+X} format, hexadecimal representation for integers, was not consistently applied across different log format aliases.
Inconsistent null values. Null values were represented inconsistently, often as empty strings, sometimes as "-", and other times as -1. This inconsistency required treating everything as a string and manually adding quotes, rather than relying on %{+Q}, to ensure valid encoding (e.g., avoiding invalid encoding in JSON like {"foo":} or {"foo":-}).
String handling for numerical values. Some numerical values, such as %ms (padded output) or %rc/%B (leading characters), had to be handled as strings (for historical reasons).
HAProxy users had to spend additional effort creating custom parsers and handling log format issues manually. This resulted in increased complexity and inconsistencies in logging when trying to create JSON outputs, making it more difficult to achieve accurate and reliable log outputs.
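The null-value problem above can be illustrated with a minimal Python sketch. It assumes a hand-written template that pastes the raw field value straight into a JSON object; the helper names handcrafted_json and is_valid_json are hypothetical and are not part of HAProxy.

```python
import json


def handcrafted_json(value: str) -> str:
    # Mimics a hand-written log-format template that pastes the raw,
    # unquoted field value into a JSON object (hypothetical helper).
    return '{"foo":' + value + '}'


def is_valid_json(text: str) -> bool:
    # True only when the text parses as JSON.
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# "404" and "-1" are real numbers; "-" and "" are the null placeholders
# mentioned above, and they break the document when left unquoted.
results = {raw: is_valid_json(handcrafted_json(raw)) for raw in ["404", "-1", "-", ""]}
print(results)
```

Only the first two values produce parseable JSON, which is why every field previously had to be treated as a quoted string.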
The new JSON and CBOR encoding options address these issues by providing structured data logging options out-of-the-box.
JSON (JavaScript Object Notation) is a standard text-based format for storing and transporting data.
CBOR (Concise Binary Object Representation) is a binary data format often preferred for optimizing network bandwidth. Unless the binary option (+bin) is specified, HAProxy emits CBOR as hexadecimal text for interoperability purposes, which can be more verbose than both JSON and HAProxy’s default log format.
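To make the size trade-off concrete, here is a rough Python sketch that hand-encodes a one-field CBOR map and compares its binary size, its hexadecimal text size, and the JSON equivalent. The function encode_tiny_cbor_map is a hypothetical helper written for this illustration; it is not HAProxy's encoder, and HAProxy's actual output may use different CBOR constructs.

```python
import json
import struct


def encode_tiny_cbor_map(key: str, value: int) -> bytes:
    # Hypothetical helper: encodes {key: value} as a definite-length CBOR
    # map. It only supports keys shorter than 24 bytes and values that
    # need a 16-bit unsigned integer.
    key_bytes = key.encode("utf-8")
    if len(key_bytes) >= 24 or not 256 <= value <= 0xFFFF:
        raise ValueError("outside the range this sketch supports")
    return (
        bytes([0xA1])                                 # map holding one pair
        + bytes([0x60 + len(key_bytes)]) + key_bytes  # short text string
        + bytes([0x19]) + struct.pack(">H", value)    # 16-bit unsigned int
    )


binary = encode_tiny_cbor_map("status_code", 404)
hex_text = binary.hex().upper()          # text-safe form, two chars per byte
json_text = json.dumps({"status_code": 404})

sizes = {"cbor_binary": len(binary), "cbor_hex": len(hex_text), "json": len(json_text)}
print(sizes)
```

For this single field the binary form is the smallest, while the hexadecimal text form is twice the binary size and larger than the JSON text.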
Together, these two encoding options give HAProxy users two widely used log formats, enabling better performance and integration with other systems and teams. The new log format encoders in HAProxy 3.0 provide a more standardized approach to logging, eliminating the need for custom parsers, middleware for conversions, and extensive log format adjustments. Users should find their logging setups are easier to maintain.
The advantages of HAProxy 3.0’s new encoding options
With HAProxy 3.0, users no longer require alternative solutions to convert HAProxy logs to JSON or CBOR. This native support simplifies handling the large volume of logs generated by modern applications and microservices, making it easier for teams to operationalize and make sense of their data.
While alternatives like Syslog remain a standard, their lack of descriptiveness can make them less suitable for modern logging needs. JSON and CBOR offer more structure and interoperability. Even though these formats can be more verbose, the benefit of multi-team log consistency is a fair trade-off.
Encoding logs in JSON means you don’t need to manually craft a log yourself by “hacking” the log format. This makes logs encoded in JSON less error-prone and more consistent because manual adjustments are minimized. Furthermore, JSON-structured logs are easier to read and parse than default HAProxy logs, ultimately resulting in easier debugging and archiving experiences.
CBOR support in HAProxy 3.0 introduces a way of structuring logs that was previously unattainable through manual adjustments: unlike JSON, CBOR output could not be produced simply by “hacking” the log format. CBOR support improves interoperability with existing tools and products and reduces bandwidth usage by transmitting a pure binary payload over the wire, provided the ring/syslog endpoint supports it.
Practical examples of JSON and CBOR structured logs
JSON equivalent of "option httplog"
When enabling option httplog, HAProxy implicitly sets the proxy log-format directive to the default HTTP access log formatted string, which can be accessed through the global environment variable named HAPROXY_HTTP_LOG_FMT.
Setting option httplog is equivalent to setting log-format to ${HAPROXY_HTTP_LOG_FMT} on an HTTP proxy:
mode http
log-format "${HAPROXY_HTTP_LOG_FMT}"
As of HAProxy 3.0, HAPROXY_HTTP_LOG_FMT is defined as the following string, in which each alias represents a part of the request or response:
"%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"
As a reminder, enabling option httplog on a proxy will produce the following:
<134>1 2024-06-20T11:47:56.817934+02:00 - haproxy 266980 - - 127.0.0.1:50516 [20/Jun/2024:11:47:56.766] mybackend mybackend/s1 0/0/24/27/51 404 379 - - ---- 1/1/0/0/0 0/0 "GET /index.html HTTP/1.1"
The log payload, shown after the log header, comes with the limitations mentioned earlier regarding log consistency and parsing.
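As an illustration of those limitations, the following Python sketch shows the kind of positional parser the default format calls for. The field positions and the key names in record are assumptions that hold only for the sample payload above (for example, with no captured headers); this is not a general-purpose parser.

```python
import shlex

# Payload copied from the sample log line above (log header removed).
payload = (
    '127.0.0.1:50516 [20/Jun/2024:11:47:56.766] mybackend mybackend/s1 '
    '0/0/24/27/51 404 379 - - ---- 1/1/0/0/0 0/0 "GET /index.html HTTP/1.1"'
)

# shlex keeps the quoted request line together as a single field.
fields = shlex.split(payload)

# Every value arrives as a string and must be split and typed by hand.
client_ip, client_port = fields[0].rsplit(":", 1)
backend, server = fields[3].split("/", 1)
timers = [int(t) for t in fields[4].split("/")]

record = {
    "client_ip": client_ip,
    "client_port": int(client_port),
    "be_name": backend,
    "server_name": server,
    "status_code": int(fields[5]),
    "bytes_read": int(fields[6]),
    "http_request": fields[-1],
}
print(record, timers)
```

Any change to the log format, such as an added capture, shifts the positions and silently breaks a parser written this way.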
To generate JSON structured logs instead, we can create our own log format string based on HAPROXY_HTTP_LOG_FMT:
setenv HTTPLOG_JSON "%{+json}o %(client_ip)ci %(client_port)cp %(request_date)tr %(fe_name_transport)ft %(be_name)b %(server_name)s %(time_request)TR %(time_wait)Tw %(time_connect)Tc %(time_response)Tr/%(time_active)Ta %(status_code)ST %(bytes_read)B %(captured_request_cookie)CC %(captured_response_cookie)CS %(termination_state_cookie)tsc %(actconn)ac %(feconn)fc %(beconn)bc %(srv_conn)sc %(retries)rc %(srv_queue)sq %(backend_queue)bq %(captured_request_headers)hr %(captured_response_headers)hs %(http_request){+Q}r"
If we configure HAProxy to use our own JSON log format:
log-format "${HTTPLOG_JSON}"
HAProxy will generate logs like this (log payload following the log header, everything after - -):
<134>1 2024-06-20T15:56:09.259280+02:00 - haproxy 279818 - - {"client_ip": "127.0.0.1", "client_port": "58734", "request_date": "20/Jun/2024:15:56:09.206", "fe_name_transport": "myfront", "be_name": "mybackend", "server_name": "s1", "time_request": 0, "time_wait": 0, "time_connect": 25, "time_response": 27, "time_active": 52, "status_code": 404, "bytes_read": 379, "captured_request_cookie": "", "captured_response_cookie": "", "termination_state_cookie": "----", "actconn": 1, "feconn": 1, "beconn": 0, "srv_conn": 0, "retries": 0, "srv_queue": 0, "backend_queue": 0, "captured_request_headers": null, "captured_response_headers": null, "http_request": "GET /index.html HTTP/1.1"}
You can verify that no information from option httplog is missing and that the result is fully JSON-compliant using a JSON validator.
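One way to run such a check locally is with Python's standard json module. The sketch below uses an abridged copy of the sample line (only some fields are kept) and assumes the log header never contains an opening brace, so the payload can be located at the first "{".

```python
import json

# Abridged copy of the sample log line above; the values are unchanged.
line = (
    '<134>1 2024-06-20T15:56:09.259280+02:00 - haproxy 279818 - - '
    '{"client_ip": "127.0.0.1", "client_port": "58734", "status_code": 404, '
    '"bytes_read": 379, "captured_request_headers": null, '
    '"http_request": "GET /index.html HTTP/1.1"}'
)

# Assumption: the header holds no "{", so the payload starts at the first one.
start = line.index("{")
header = line[:start].rstrip()
record = json.loads(line[start:])   # raises json.JSONDecodeError if invalid

# Numbers arrive as integers and missing values as None, with no extra work.
print(header)
print(record["status_code"], record["captured_request_headers"])
```

No positional splitting or manual type conversion is needed, because the field names and types travel with the log line.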
CBOR (plain text) equivalent of "option httplog"
Let’s do the same thing for CBOR logs.
To generate CBOR (plain text) structured logs, we can create our own log-format string that enables the CBOR encoding option by setting %{+cbor}o:
setenv HTTPLOG_CBOR "%{+cbor}o %(client_ip)ci %(client_port)cp %(request_date)tr %(fe_name_transport)ft %(be_name)b %(server_name)s %(time_request)TR %(time_wait)Tw %(time_connect)Tc %(time_response)Tr/%(time_active)Ta %(status_code)ST %(bytes_read)B %(captured_request_cookie)CC %(captured_response_cookie)CS %(termination_state_cookie)tsc %(actconn)ac %(feconn)fc %(beconn)bc %(srv_conn)sc %(retries)rc %(srv_queue)sq %(backend_queue)bq %(captured_request_headers)hr %(captured_response_headers)hs %(http_request){+Q}r"
If we configure HAProxy to use our own CBOR log format:
log-format "${HTTPLOG_CBOR}"
HAProxy will generate logs like this:
<134>1 2024-06-20T16:00:56.444019+02:00 - haproxy|
You can verify CBOR compliance using the CBOR.me online tool.
As demonstrated, leveraging HAProxy’s new JSON and CBOR encoding is significantly easier than encoding the log payload yourself.
Conclusion
The addition of JSON and CBOR encoding in HAProxy 3.0 streamlines log management and improves interoperability between systems. With these standardized formats, HAProxy reduces the complexity of log extraction and formatting, making it simpler for teams to maintain consistency across their infrastructure.