Reference


At a minimum, you will need to specify 'host', 'user', 'password', and 'producer'. The kafka producer also requires 'kafka.bootstrap.servers'; the kinesis producer requires 'kinesis_stream'.

general

option argument description default
config STRING location of config.properties file $PWD/config.properties
log_level LOG_LEVEL log level info
daemon running maxwell as a daemon
env_config_prefix STRING env vars matching prefix are treated as config values

mysql

option argument description default
host STRING mysql host localhost
user STRING mysql username
password STRING mysql password (no password)
port INT mysql port 3306
jdbc_options STRING mysql jdbc connection options DEFAULT_JDBC_OPTS
ssl SSL_OPT SSL behavior for mysql connection DISABLED
schema_database STRING database to store schema and position in maxwell
client_id STRING unique text identifier for maxwell instance maxwell
replica_server_id LONG unique numeric identifier for this maxwell instance 6379 (see notes)
master_recovery BOOLEAN enable experimental master recovery code false
gtid_mode BOOLEAN enable GTID-based replication false
recapture_schema BOOLEAN recapture the latest schema. Not available in config.properties. false
max_schemas LONG how many schema deltas to keep before triggering a compaction operation unlimited
 
replication_host STRING server to replicate from. See split server roles schema-store host
replication_password STRING password on replication server (none)
replication_port INT port on replication server 3306
replication_user STRING user on replication server
replication_ssl SSL_OPT SSL behavior for replication connection DISABLED
replication_jdbc_options STRING mysql jdbc connection options for replication server DEFAULT_JDBC_OPTS
 
schema_host STRING server to capture schema from. See split server roles schema-store host
schema_password STRING password on schema-capture server (none)
schema_port INT port on schema-capture server 3306
schema_user STRING user on schema-capture server
schema_ssl SSL_OPT SSL behavior for schema-capture server DISABLED
schema_jdbc_options STRING mysql jdbc connection options for schema server DEFAULT_JDBC_OPTS
 

producer options

option argument description default
producer PRODUCER_TYPE type of producer to use stdout
custom_producer.factory CLASS_NAME fully qualified custom producer factory class, see example
producer_ack_timeout PRODUCER_ACK_TIMEOUT time in milliseconds before async producers consider a message lost
producer_partition_by PARTITION_BY input to kafka/kinesis partition function database
producer_partition_columns STRING if partitioning by 'column', a comma separated list of columns
producer_partition_by_fallback PARTITION_BY_FALLBACK required when producer_partition_by=column. Used when the column is missing
ignore_producer_error BOOLEAN When false, Maxwell will terminate on kafka/kinesis/pubsub publish errors (aside from RecordTooLargeException). When true, errors are only logged. See also dead_letter_topic true

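The partitioning options above can be pictured with a minimal sketch. It is not Maxwell's implementation: the row layout, the way column values are joined, and the use of `zlib.crc32` as the hash are all assumptions made for illustration. It only shows the documented idea that `producer_partition_by` selects the hash input and that the fallback applies when a configured column is missing.

```python
import zlib

def partition_key(row, partition_by, columns=(), fallback="database"):
    """Pick the string that feeds the partition hash (illustrative only)."""
    if partition_by == "column":
        # Use the fallback when any configured column is absent from the row.
        if all(c in row["data"] for c in columns):
            # Joining the values as strings is an assumption of this sketch.
            return "".join(str(row["data"][c]) for c in columns)
        partition_by = fallback
    if partition_by == "database":
        return row["database"]
    if partition_by == "table":
        return row["table"]
    raise ValueError(f"partition_by not covered by this sketch: {partition_by}")

def choose_partition(key, num_partitions):
    # zlib.crc32 is a stand-in hash, not the function Maxwell uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

With `producer_partition_by=column` and a row that lacks the configured column, the sketch hashes whatever `producer_partition_by_fallback` names instead.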
file producer

option argument description default
output_file STRING output file for file producer
javascript STRING file containing javascript filters

kafka producer

option argument description default
kafka.bootstrap.servers STRING kafka brokers, given as HOST:PORT[,HOST:PORT]
kafka_topic STRING kafka topic to write to. maxwell
dead_letter_topic STRING the topic to write a "skeleton row" (a row where data includes only primary key columns) when there's an error publishing a row. When ignore_producer_error is false, only RecordTooLargeException causes a fallback record to be published, since other errors cause termination. Currently only supported in Kafka publisher
kafka_version KAFKA_VERSION run maxwell with specified kafka producer version. Not available in config.properties. 0.11.0.1
kafka_partition_hash [ default | murmur3 ] hash function to use when choosing kafka partition default
kafka_key_format [ array | hash ] how maxwell outputs kafka keys, either a hash or an array of hashes hash
ddl_kafka_topic STRING if output_ddl is true, kafka topic to write DDL changes to kafka_topic
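To make the "skeleton row" written to dead_letter_topic concrete, here is a rough sketch. The row is modeled as a plain dict with a `data` field; the other field names and the helper itself are hypothetical and only illustrate "data includes only primary key columns".

```python
def skeleton_row(row, primary_key_columns):
    """Return a copy of `row` whose data holds only the primary key columns."""
    skeleton = dict(row)  # shallow copy so the original row is left untouched
    skeleton["data"] = {
        col: row["data"][col]
        for col in primary_key_columns
        if col in row["data"]
    }
    return skeleton
```

For a row with `data` of `{"id": 7, "note": "very large text"}` and primary key `id`, the skeleton keeps the metadata and reduces `data` to `{"id": 7}`.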

kinesis producer

option argument description default
kinesis_stream STRING kinesis stream name

sqs producer

option argument description default
sqs_queue_uri STRING SQS Queue URI

pubsub producer

option argument description default
pubsub_topic STRING Google Cloud pub-sub topic
pubsub_platform_id STRING Google Cloud platform id associated with topic
ddl_pubsub_topic STRING Google Cloud pub-sub topic to send DDL events to
pubsub_request_bytes_threshold LONG Set number of bytes until batch is sent 1
pubsub_message_count_batch_size LONG Set number of messages until batch is sent 1
pubsub_publish_delay_threshold LONG Set time passed in millis until batch is sent 1
pubsub_retry_delay LONG Controls the delay in millis before sending the first retry message 100
pubsub_retry_delay_multiplier FLOAT Controls the increase in retry delay per retry 1.3
pubsub_max_retry_delay LONG Puts a limit on the value in seconds of the retry delay 60
pubsub_initial_rpc_timeout LONG Controls the timeout in seconds for the initial RPC 5
pubsub_rpc_timeout_multiplier FLOAT Controls the change in RPC timeout 1.0
pubsub_max_rpc_timeout LONG Puts a limit on the value in seconds of the RPC timeout 600
pubsub_total_timeout LONG Puts a limit in seconds on the total time spent on a publish call, including all retries, before giving up 600
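As a rough illustration of how pubsub_retry_delay, pubsub_retry_delay_multiplier and pubsub_max_retry_delay interact, the sketch below computes a capped exponential delay sequence from the defaults listed above. It ignores any jitter or extra logic the client library may apply, and the function name is made up for this example.

```python
def retry_delays(initial_ms=100, multiplier=1.3, max_delay_s=60, attempts=5):
    """Return the delay in milliseconds before each retry, capped at the max."""
    cap_ms = max_delay_s * 1000.0
    delays = []
    delay = float(initial_ms)
    for _ in range(attempts):
        delays.append(min(delay, cap_ms))
        delay *= multiplier  # grow the delay for the next retry
    return delays
```

With the defaults, the first delays are roughly 100 ms, 130 ms and 169 ms, and no delay ever exceeds 60 seconds.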

rabbitmq producer

option argument description default
rabbitmq_user STRING Username of Rabbitmq connection guest
rabbitmq_pass STRING Password of Rabbitmq connection guest
rabbitmq_host STRING Host of Rabbitmq machine
rabbitmq_port INT Port of Rabbitmq machine
rabbitmq_virtual_host STRING Virtual Host of Rabbitmq
rabbitmq_exchange STRING Name of exchange for rabbitmq publisher
rabbitmq_exchange_type STRING Exchange type for rabbitmq
rabbitmq_exchange_durable BOOLEAN Exchange durability. false
rabbitmq_exchange_autodelete BOOLEAN If set, the exchange is deleted when all queues have finished using it. false
rabbitmq_routing_key_template STRING A string template for the routing key; %db% and %table% will be substituted. %db%.%table%
rabbitmq_message_persistent BOOLEAN Enable message persistence. false
rabbitmq_declare_exchange BOOLEAN Should declare the exchange for rabbitmq publisher true
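The substitution described for rabbitmq_routing_key_template can be sketched as plain string replacement. This is an illustration of the documented behavior, not Maxwell's code, and the helper name is hypothetical.

```python
def routing_key(template, database, table):
    """Fill the %db% and %table% placeholders of a routing key template."""
    return template.replace("%db%", database).replace("%table%", table)
```

For example, the template `%db%.%table%` applied to database `shop` and table `orders` gives `shop.orders`.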

redis producer

option argument description default
redis_host STRING Host of Redis server localhost
redis_port INT Port of Redis server 6379
redis_auth STRING Authentication key for a password-protected Redis server
redis_database INT Database of Redis server 0
redis_type [ pubsub | xadd | lpush | rpush ] Selects either Redis Pub/Sub, Stream, or List. pubsub
redis_key STRING Redis channel/key for Pub/Sub, XADD or LPUSH/RPUSH maxwell
redis_stream_json_key STRING Redis XADD Stream Message Field Name message
redis_sentinels STRING Redis sentinels list in format host1:port1,host2:port2,host3:port3... Must only be used together with redis_sentinel_master_name
redis_sentinel_master_name STRING Redis sentinel master name. Must only be used together with redis_sentinels

formatting

option argument description default
output_binlog_position BOOLEAN records include binlog position false
output_gtid_position BOOLEAN records include gtid position, if available false
output_commit_info BOOLEAN records include commit and xid true
output_xoffset BOOLEAN records include virtual tx-row offset false
output_nulls BOOLEAN records include fields with NULL values true
output_server_id BOOLEAN records include server_id false
output_thread_id BOOLEAN records include thread_id false
output_schema_id BOOLEAN records include schema_id; schema_id is the id of the latest schema tracked by maxwell and doesn't relate to any mysql tracked value false
output_row_query BOOLEAN records include INSERT/UPDATE/DELETE statement. Mysql option "binlog_rows_query_log_events" must be enabled false
output_primary_keys BOOLEAN DML records include list of values that make up a row's primary key false
output_primary_key_columns BOOLEAN DML records include list of columns that make up a row's primary key false
output_ddl BOOLEAN output DDL (table-alter, table-create, etc) events false
output_null_zerodates BOOLEAN should we transform '0000-00-00' to null? false
output_naming_strategy STRING naming strategy for JSON field names. Can be underscore_to_camelcase none

filtering

option argument description default
filter STRING filter rules, eg exclude: db.*, include: *.tbl, include: *./bar(bar)?/, exclude: foo.bar.col=val

encryption

option argument description default
encrypt [ none | data | all ] encrypt mode: none = no encryption; data = encrypt the data field only; all = encrypt the entire maxwell message none
secret_key STRING specify the encryption key to be used null

high availability

option argument description default
ha enable maxwell client HA
jgroups_config STRING location of xml configuration file for jGroups $PWD/raft.xml
raft_member_id STRING uniquely identify this node within jgroups-raft cluster

monitoring / metrics

option argument description default
metrics_prefix STRING the prefix maxwell will apply to all metrics MaxwellMetrics
metrics_type [slf4j | jmx | http | datadog] how maxwell metrics will be reported
metrics_jvm BOOLEAN enable jvm metrics: memory usage, GC stats, etc. false
metrics_slf4j_interval SECONDS the frequency metrics are emitted to the log, in seconds, when slf4j reporting is configured 60
http_port INT the port the server will bind to when http reporting is configured 8080
http_path_prefix STRING http path prefix for the server /
http_bind_address STRING the address the server will bind to when http reporting is configured all addresses
http_diagnostic BOOLEAN enable http diagnostic endpoint false
http_diagnostic_timeout MILLISECONDS the http diagnostic response timeout 10000
metrics_datadog_type [udp | http] how metrics are reported when metrics_type includes datadog; only one of [udp | http] may be chosen udp
metrics_datadog_tags STRING datadog tags that should be supplied, e.g. tag1:value1,tag2:value2
metrics_age_slo INT Latency service level objective threshold in seconds (Optional). When set, a message.publish.age.slo_violation metric is emitted to Datadog if the latency exceeds the threshold
metrics_datadog_interval INT the frequency metrics are pushed to datadog, in seconds 60
metrics_datadog_apikey STRING the datadog api key to use when metrics_datadog_type = http
metrics_datadog_site STRING the site to publish metrics to when metrics_datadog_type = http us
metrics_datadog_host STRING the host to publish metrics to when metrics_datadog_type = udp localhost
metrics_datadog_port INT the port to publish metrics to when metrics_datadog_type = udp 8125

misc

option argument description default
bootstrapper [async | sync | none] bootstrapper type. See bootstrapping docs. async
init_position FILE:POSITION[:HEARTBEAT] ignore the information in maxwell.positions and start at the given binlog position. Not available in config.properties.
replay BOOLEAN enable maxwell's read-only "replay" mode: don't store a binlog position or schema changes. Not available in config.properties.
buffer_memory_usage FLOAT Determines how much memory the Maxwell event buffer will use from the jvm max memory. Size of the buffer is: buffer_memory_usage * -Xmx 0.25
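The buffer_memory_usage formula is simple arithmetic; a tiny worked sketch, with a hypothetical helper name, shows what the default means for a given -Xmx value.

```python
def buffer_bytes(max_heap_bytes, buffer_memory_usage=0.25):
    """Event buffer size: buffer_memory_usage * -Xmx, in bytes."""
    return int(max_heap_bytes * buffer_memory_usage)

GIB = 1024 ** 3
print(buffer_bytes(4 * GIB) // GIB)  # a 4 GiB heap with the default fraction
```

With a 4 GiB maximum heap and the default of 0.25, the buffer may use up to 1 GiB.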

LOG_LEVEL: [ debug | info | warn | error ]

SSL_OPT: [ DISABLED | PREFERRED | REQUIRED | VERIFY_CA | VERIFY_IDENTITY ]

PRODUCER_TYPE: [ stdout | file | kafka | kinesis | pubsub | sqs | rabbitmq | redis ]

DEFAULT_JDBC_OPTS: zeroDateTimeBehavior=convertToNull&connectTimeout=5000

PARTITION_BY: [ database | table | primary_key | transaction_id | column | random ]

PARTITION_BY_FALLBACK: [ database | table | primary_key | transaction_id ]

KAFKA_VERSION: [ 0.8.2.2 | 0.9.0.1 | 0.10.0.1 | 0.10.2.1 | 0.11.0.1 ]

PRODUCER_ACK_TIMEOUT: In certain failure modes, async producers (kafka, kinesis, pubsub, sqs) may silently drop a message, never notifying maxwell of success or failure. This timeout can be set as a heuristic; after this many milliseconds, maxwell will consider an outstanding message lost and fail it.
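The ack-timeout heuristic can be sketched as a small in-flight tracker. This is an illustration of the idea only, assuming each message has an id and that something calls `expire()` periodically; the class and method names are hypothetical and do not come from Maxwell.

```python
import time

class InflightTracker:
    """Track unacknowledged messages and report those older than a timeout."""

    def __init__(self, timeout_ms, clock=time.monotonic):
        self.timeout_s = timeout_ms / 1000.0
        self.clock = clock          # injectable so the logic can be tested
        self.sent_at = {}           # message id -> time the message was sent

    def sent(self, message_id):
        self.sent_at[message_id] = self.clock()

    def acked(self, message_id):
        # The producer reported success or failure; stop tracking the message.
        self.sent_at.pop(message_id, None)

    def expire(self):
        """Return ids outstanding longer than the timeout and forget them."""
        now = self.clock()
        lost = [m for m, t in self.sent_at.items() if now - t > self.timeout_s]
        for m in lost:
            del self.sent_at[m]
        return lost
```

A caller would treat every id returned by `expire()` as a failed publish, which matches "consider an outstanding message lost and fail it".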

Configuration methods


Maxwell is configurable via the command-line, a properties file, or the environment. The configuration priority is:

command line options > scoped env vars > properties file > default values
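One way to picture this priority order is a layered lookup in which the first source that defines a key wins. The sketch below uses Python's `ChainMap`; the dictionaries and the `resolve` helper are illustrative and say nothing about how Maxwell itself is implemented.

```python
from collections import ChainMap

def resolve(cli, env, properties, defaults):
    """Layer config sources; earlier arguments take priority over later ones."""
    return ChainMap(cli, env, properties, defaults)

config = resolve(
    cli={"port": "3307"},
    env={"port": "3308", "user": "envuser"},
    properties={"user": "fileuser", "host": "db1"},
    defaults={"host": "localhost", "port": "3306", "producer": "stdout"},
)
```

Here `port` comes from the command line, `user` from the environment, `host` from the properties file, and `producer` from the defaults.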

config.properties

Maxwell can be configured via a java properties file, specified via --config or named "config.properties" in the current working directory. Any command line options (except init_position, replay, kafka_version and daemon) may be specified as "key=value" pairs.

via environment

If env_config_prefix is given via the command line or in config.properties, Maxwell will configure itself with all environment variables that match the prefix. The environment variable names are case-insensitive. For example, if maxwell is started with --env_config_prefix=FOO_ and the environment contains FOO_USER=auser, this would be equivalent to passing --user=auser.
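The prefix matching described above can be sketched as follows. Lower-casing the resulting option name is an assumption drawn from the statement that names are case-insensitive; the helper is hypothetical and not Maxwell's code.

```python
def config_from_env(environ, prefix):
    """Map PREFIX_NAME=value environment entries to name=value options."""
    prefix = prefix.lower()
    return {
        name.lower()[len(prefix):]: value
        for name, value in environ.items()
        # Match case-insensitively and skip a variable that is only the prefix.
        if name.lower().startswith(prefix) and len(name) > len(prefix)
    }
```

Given `FOO_USER=auser` and the prefix `FOO_`, the sketch yields `{"user": "auser"}`, mirroring the `--user=auser` equivalence.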

Deployment scenarios


At a minimum, Maxwell needs row-level replication turned on in order to operate:

[mysqld]
server_id=1
log-bin=master
binlog_format=row

GTID support

As of 1.8.0, Maxwell contains support for GTID-based replication. Enable it with the --gtid_mode configuration param.

Here's how you might configure your mysql server for GTID mode:

$ vi my.cnf

[mysqld]
server_id=1
log-bin=master
binlog_format=row
gtid-mode=ON
log-slave-updates=ON
enforce-gtid-consistency=true

When in GTID-mode, Maxwell will transparently pick up a new replication position after a master change. Note that you will still have to re-point maxwell to the new master.

GTID support in Maxwell is considered beta-quality at the moment; notably, Maxwell is unable to transparently upgrade from a traditional-replication scenario to a GTID-replication scenario; currently, when you enable gtid mode Maxwell will recapture the schema and GTID-position from "wherever the master is at".

RDS configuration

To run Maxwell against RDS (either Aurora or Mysql), you will need to do the following:

  • set binlog_format to "ROW". Do this in the "parameter groups" section. For a Mysql-RDS instance this parameter will be in a "DB Parameter Group", for Aurora it will be in a "DB Cluster Parameter Group".
  • set up RDS binlog retention. The tl;dr is to execute call mysql.rds_set_configuration('binlog retention hours', 24) on the server.

Split server roles

Maxwell uses MySQL for 3 different functions:

  1. A host to store the captured schema in (--host).
  2. A host to replicate from (--replication_host).
  3. A host to capture the schema from (--schema_host).

Often, all three hosts are the same. host and replication_host should be different if maxwell is chained off a replica. schema_host should only be used when using the maxscale replication proxy.
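The three roles and their documented defaults can be sketched as a small resolution step: replication_host and schema_host fall back to the schema-store host when unset. The helper and the returned key names are hypothetical.

```python
def effective_hosts(config):
    """Resolve which host serves each of the three roles."""
    host = config.get("host", "localhost")
    return {
        "schema_store": host,
        # Both of these default to the schema-store host per the tables above.
        "replication": config.get("replication_host", host),
        "schema_capture": config.get("schema_host", host),
    }
```

For a Maxwell chained off a replica, setting only `host` and `replication_host` leaves schema capture on the schema-store host.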

Multiple Maxwell Instances

Maxwell can operate with multiple instances running against a single master, in different configurations. This can be useful if you wish to have producers running in different configurations, for example producing different groups of tables to different topics. Each instance of Maxwell must be configured with a unique client_id, in order to store unique binlog positions.

With MySQL 5.5 and below, each replicator (be it mysql, maxwell, whatever) must also be configured with a unique replica_server_id. This is a 32-bit integer that corresponds to mysql's server_id parameter. The value you configure should be unique across all mysql and maxwell instances.
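Before starting several instances, it can help to check that no client_id or replica_server_id is reused. The following is a minimal sketch of such a check over a list of planned instance settings; the instance names and values are made up for illustration.

```python
from collections import Counter

def duplicate_values(instances, key):
    """Return the values of `key` that appear in more than one instance."""
    counts = Counter(inst[key] for inst in instances)
    return sorted(value for value, n in counts.items() if n > 1)

# Hypothetical planned instances; the second and third share a client_id.
instances = [
    {"client_id": "maxwell_orders", "replica_server_id": 6379},
    {"client_id": "maxwell_users", "replica_server_id": 6380},
    {"client_id": "maxwell_users", "replica_server_id": 6381},
]
```

Any value reported for `client_id` would mean two instances share stored binlog positions, which the text above says must be avoided.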