How to Solve Circuit Breaker Exception [Data too large]

Data is being ingested from Kafka into OpenSearch using a Kafka connector, but after a period of successful ingestion I start getting a circuit-breaking exception.

I do not understand why I am getting this error or what exactly is causing it. I need help resolving it.

Currently, the ingestion rate on Kafka is 1MBPS, and data is ingested into OpenSearch at almost the same rate. I need to understand how to scale the OpenSearch cluster, as it seems the GC is not doing its job here.

Caused by: ElasticsearchException[Elasticsearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [indices:data/write/bulk[s]] would be [2041342996/1.9gb], which is larger than the limit of [2040109465/1.8gb], real usage: [2041339384/1.9gb], new bytes reserved: [3612/3.5kb], usages [request=0/0b, fielddata=1100/1kb, in_flight_requests=146910/143.4kb, accounting=11283158/10.7mb]]]
	at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
	at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
	at org.elasticsearch.action.bulk.BulkItemResponse.fromXContent(BulkItemResponse.java:139)
	at

Reference - What does this error mean - Data too large, data for [<transport_request>] - #7 by Dmitry1 - Elasticsearch - Discuss the Elastic Stack

For your reference, I am sharing my OpenSearch cluster information -

localhost:9200/_cat/nodes?v=true&h=name,node*,heap*

name                        id   node.role heap.current heap.percent heap.max
opensearch-cluster-master-0 go18 dimr           958.1mb           46      2gb
opensearch-cluster-master-1 UFLO dimr           517.6mb           25      2gb
opensearch-cluster-master-2 YeVY dimr           600.8mb           29      2gb

localhost:9200/_nodes/stats/breaker

{
    "_nodes": {
        "total": 3,
        "successful": 3,
        "failed": 0
    },
    "cluster_name": "opensearch-cluster",
    "nodes": {
        "go181GT5QauCMksdcSotvw": {
            "timestamp": 1640861546011,
            "name": "opensearch-cluster-master-0",
            "transport_address": "10.0.3.16:9300",
            "host": "10.0.3.16",
            "ip": "10.0.3.16:9300",
            "roles": [
                "data",
                "ingest",
                "master",
                "remote_cluster_client"
            ],
            "breakers": {
                "request": {
                    "limit_size_in_bytes": 1288490188,
                    "limit_size": "1.1gb",
                    "estimated_size_in_bytes": 0,
                    "estimated_size": "0b",
                    "overhead": 1.0,
                    "tripped": 0
                },
                "fielddata": {
                    "limit_size_in_bytes": 858993459,
                    "limit_size": "819.1mb",
                    "estimated_size_in_bytes": 0,
                    "estimated_size": "0b",
                    "overhead": 1.03,
                    "tripped": 0
                },
                "in_flight_requests": {
                    "limit_size_in_bytes": 2147483648,
                    "limit_size": "2gb",
                    "estimated_size_in_bytes": 122831,
                    "estimated_size": "119.9kb",
                    "overhead": 2.0,
                    "tripped": 0
                },
                "accounting": {
                    "limit_size_in_bytes": 2147483648,
                    "limit_size": "2gb",
                    "estimated_size_in_bytes": 9982094,
                    "estimated_size": "9.5mb",
                    "overhead": 1.0,
                    "tripped": 0
                },
                "parent": {
                    "limit_size_in_bytes": 2040109465,
                    "limit_size": "1.8gb",
                    "estimated_size_in_bytes": 529915392,
                    "estimated_size": "505.3mb",
                    "overhead": 1.0,
                    "tripped": 108
                }
            }
        },
        "UFLOrQz3QhOY2-vxekD7pg": {
            "timestamp": 1640861546012,
            "name": "opensearch-cluster-master-1",
            "transport_address": "10.0.2.24:9300",
            "host": "10.0.2.24",
            "ip": "10.0.2.24:9300",
            "roles": [
                "data",
                "ingest",
                "master",
                "remote_cluster_client"
            ],
            "breakers": {
                "request": {
                    "limit_size_in_bytes": 1288490188,
                    "limit_size": "1.1gb",
                    "estimated_size_in_bytes": 0,
                    "estimated_size": "0b",
                    "overhead": 1.0,
                    "tripped": 0
                },
                "fielddata": {
                    "limit_size_in_bytes": 858993459,
                    "limit_size": "819.1mb",
                    "estimated_size_in_bytes": 0,
                    "estimated_size": "0b",
                    "overhead": 1.03,
                    "tripped": 0
                },
                "in_flight_requests": {
                    "limit_size_in_bytes": 2147483648,
                    "limit_size": "2gb",
                    "estimated_size_in_bytes": 288912,
                    "estimated_size": "282.1kb",
                    "overhead": 2.0,
                    "tripped": 0
                },
                "accounting": {
                    "limit_size_in_bytes": 2147483648,
                    "limit_size": "2gb",
                    "estimated_size_in_bytes": 10745222,
                    "estimated_size": "10.2mb",
                    "overhead": 1.0,
                    "tripped": 0
                },
                "parent": {
                    "limit_size_in_bytes": 2040109465,
                    "limit_size": "1.8gb",
                    "estimated_size_in_bytes": 540055040,
                    "estimated_size": "515mb",
                    "overhead": 1.0,
                    "tripped": 0
                }
            }
        },
        "YeVYtrgnS9eeuzXFa2cijg": {
            "timestamp": 1640861546011,
            "name": "opensearch-cluster-master-2",
            "transport_address": "10.0.101.173:9300",
            "host": "10.0.101.173",
            "ip": "10.0.101.173:9300",
            "roles": [
                "data",
                "ingest",
                "master",
                "remote_cluster_client"
            ],
            "breakers": {
                "request": {
                    "limit_size_in_bytes": 1288490188,
                    "limit_size": "1.1gb",
                    "estimated_size_in_bytes": 16440,
                    "estimated_size": "16kb",
                    "overhead": 1.0,
                    "tripped": 0
                },
                "fielddata": {
                    "limit_size_in_bytes": 858993459,
                    "limit_size": "819.1mb",
                    "estimated_size_in_bytes": 0,
                    "estimated_size": "0b",
                    "overhead": 1.03,
                    "tripped": 0
                },
                "in_flight_requests": {
                    "limit_size_in_bytes": 2147483648,
                    "limit_size": "2gb",
                    "estimated_size_in_bytes": 240003,
                    "estimated_size": "234.3kb",
                    "overhead": 2.0,
                    "tripped": 0
                },
                "accounting": {
                    "limit_size_in_bytes": 2147483648,
                    "limit_size": "2gb",
                    "estimated_size_in_bytes": 10914988,
                    "estimated_size": "10.4mb",
                    "overhead": 1.0,
                    "tripped": 0
                },
                "parent": {
                    "limit_size_in_bytes": 2040109465,
                    "limit_size": "1.8gb",
                    "estimated_size_in_bytes": 496742400,
                    "estimated_size": "473.7mb",
                    "overhead": 1.0,
                    "tripped": 0
                }
            }
        }
    }
}

localhost:9200/_cat/health?v

epoch      timestamp cluster            status node.total node.data discovered_master shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1640864665 11:44:25  opensearch-cluster green           3         3              true    135  67    0    0        0             0                  -                100.0%
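
The breaker stats above are easier to read when reduced to one row per node. Below is a minimal Python sketch that does this for a response shaped like the `_nodes/stats/breaker` output. The function name `summarize_breakers` and the trimmed `sample` dict are illustrative assumptions, not part of OpenSearch; only the field names are taken from the output above.

```python
def summarize_breakers(stats: dict) -> list:
    """Return one row per node: parent breaker usage and any breakers that tripped."""
    rows = []
    for node in stats["nodes"].values():
        breakers = node["breakers"]
        parent = breakers["parent"]
        used_pct = 100 * parent["estimated_size_in_bytes"] / parent["limit_size_in_bytes"]
        rows.append({
            "name": node["name"],
            "parent_used_pct": round(used_pct, 1),
            # Keep only breakers that have tripped at least once.
            "tripped": {name: b["tripped"] for name, b in breakers.items() if b["tripped"] > 0},
        })
    return rows


# Trimmed copy of the response above (two nodes, two breakers each).
sample = {
    "nodes": {
        "go181GT5QauCMksdcSotvw": {
            "name": "opensearch-cluster-master-0",
            "breakers": {
                "request": {"limit_size_in_bytes": 1288490188, "estimated_size_in_bytes": 0, "tripped": 0},
                "parent": {"limit_size_in_bytes": 2040109465, "estimated_size_in_bytes": 529915392, "tripped": 108},
            },
        },
        "UFLOrQz3QhOY2-vxekD7pg": {
            "name": "opensearch-cluster-master-1",
            "breakers": {
                "request": {"limit_size_in_bytes": 1288490188, "estimated_size_in_bytes": 0, "tripped": 0},
                "parent": {"limit_size_in_bytes": 2040109465, "estimated_size_in_bytes": 540055040, "tripped": 0},
            },
        },
    }
}

for row in summarize_breakers(sample):
    print(row)
```

In the pasted stats, only the parent breaker on opensearch-cluster-master-0 has tripped (108 times), and at the moment the stats were taken every node was well below the parent limit, which suggests the trips happen during short spikes in heap usage rather than from a steadily full heap.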

You may want to increase your heap, I guess.
I have seen that error several times and worked around it by increasing the heap.


Also take a look at your shard settings and the number of data nodes configured.
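
As a rough way to check this, the shard count from `_cat/health` can be compared with the heap per node. The sketch below uses an often-quoted rule of thumb from older Elasticsearch guidance (about 20 shards per GB of heap); that figure is an assumption brought in for illustration, not something stated in this thread, and it is not a hard limit.

```python
# Values taken from the _cat/health and _cat/nodes output in this thread.
ACTIVE_SHARDS = 135
DATA_NODES = 3
HEAP_GB_PER_NODE = 2

# Assumption: rule of thumb from older Elasticsearch guidance, not a hard limit.
SHARDS_PER_GB_HEAP = 20

shards_per_node = ACTIVE_SHARDS / DATA_NODES
shard_budget_per_node = SHARDS_PER_GB_HEAP * HEAP_GB_PER_NODE

print(f"shards per node: {shards_per_node:.0f}")
print(f"rule-of-thumb budget per node: {shard_budget_per_node}")
print("over budget" if shards_per_node > shard_budget_per_node else "within budget")
```

With these numbers each node holds about 45 shards against a budget of 40, so reducing the shard count, adding data nodes, or raising the heap would all move the cluster in the right direction.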


I have configured a 2GB heap, but it is still triggering the [parent] circuit breaker.

Settings -

{
    "defaults": {
        "indices.analysis.hunspell.dictionary.ignore_case": "false",
        "indices.analysis.hunspell.dictionary.lazy": "false",
        "indices.breaker.accounting.limit": "100%",
        "indices.breaker.accounting.overhead": "1.0",
        "indices.breaker.fielddata.limit": "40%",
        "indices.breaker.fielddata.overhead": "1.03",
        "indices.breaker.fielddata.type": "memory",
        "indices.breaker.request.limit": "60%",
        "indices.breaker.request.overhead": "1.0",
        "indices.breaker.request.type": "memory",
        "indices.breaker.total.limit": "95%",
        "indices.breaker.total.use_real_memory": "true",
        "indices.breaker.type": "hierarchy",
        "indices.cache.cleanup_interval": "1m",
        "indices.fielddata.cache.size": "-1b",
        "indices.id_field_data.enabled": "true",
        "indices.mapping.dynamic_timeout": "30s",
        "indices.mapping.max_in_flight_updates": "10",
        "indices.memory.index_buffer_size": "10%",
        "indices.memory.interval": "5s",
        "indices.memory.max_index_buffer_size": "-1",
        "indices.memory.min_index_buffer_size": "48mb",
        "indices.memory.shard_inactive_time": "5m",
        "indices.queries.cache.all_segments": "false",
        "indices.queries.cache.count": "10000",
        "indices.queries.cache.size": "10%",
        "indices.query.bool.max_clause_count": "1024",
        "indices.query.query_string.allowLeadingWildcard": "true",
        "indices.query.query_string.analyze_wildcard": "false",
        "indices.recovery.internal_action_long_timeout": "1800000ms",
        "indices.recovery.internal_action_timeout": "15m",
        "indices.recovery.max_bytes_per_sec": "40mb",
        "indices.recovery.max_concurrent_file_chunks": "2",
        "indices.recovery.max_concurrent_operations": "1",
        "indices.recovery.recovery_activity_timeout": "1800000ms",
        "indices.recovery.retry_delay_network": "5s",
        "indices.recovery.retry_delay_state_sync": "500ms",
        "indices.replication.initial_retry_backoff_bound": "50ms",
        "indices.replication.retry_timeout": "60s",
        "indices.requests.cache.expire": "0ms",
        "indices.requests.cache.size": "1%",
        "indices.store.delete.shard.timeout": "30s"
    }
}
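
The byte limits reported by `_nodes/stats/breaker` follow directly from these percentage settings and the 2gb heap. Here is a small Python sketch of that arithmetic; the helper name `breaker_limit_bytes` is hypothetical, and the 4 GiB figure at the end is only an example of a larger heap, not a recommendation.

```python
HEAP_MAX_BYTES = 2 * 1024**3  # heap.max of 2gb, as reported by _cat/nodes

# Percentages copied from the settings above.
LIMIT_PERCENT = {
    "indices.breaker.total.limit": 95,
    "indices.breaker.request.limit": 60,
    "indices.breaker.fielddata.limit": 40,
    "indices.breaker.accounting.limit": 100,
}


def breaker_limit_bytes(heap_bytes: int, percent: int) -> int:
    """Limit in bytes for a breaker configured as a percentage of the heap (rounded down)."""
    return heap_bytes * percent // 100


for setting, percent in LIMIT_PERCENT.items():
    print(setting, breaker_limit_bytes(HEAP_MAX_BYTES, percent))

# Example only: the parent limit if the heap were doubled to 4 GiB.
print("parent limit with 4 GiB heap:", breaker_limit_bytes(4 * 1024**3, 95))
```

The parent limit of 2040109465 bytes in the error message is therefore 95% of the 2gb heap, so the breaker trips slightly before the heap is completely full.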

Because real heap usage went above the parent breaker limit (95% of the 2GB heap), as the log says:

 which is larger than the limit of [2040109465/1.8gb], real usage: [2041339384/1.9gb],
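
To see where that 1.9gb comes from, the numbers in the exception message can be pulled apart. The Python sketch below parses the message text quoted in the first post; `parse_breaker_message` is a hypothetical helper written for this thread, and the regular expressions assume the message keeps exactly this wording.

```python
import re

# Reason text copied from the exception in the first post.
MESSAGE = (
    "[parent] Data too large, data for [indices:data/write/bulk[s]] would be "
    "[2041342996/1.9gb], which is larger than the limit of [2040109465/1.8gb], "
    "real usage: [2041339384/1.9gb], new bytes reserved: [3612/3.5kb], "
    "usages [request=0/0b, fielddata=1100/1kb, "
    "in_flight_requests=146910/143.4kb, accounting=11283158/10.7mb]"
)


def parse_breaker_message(msg: str) -> dict:
    """Extract the byte counts from a parent circuit breaker message."""
    def grab(pattern: str) -> int:
        return int(re.search(pattern, msg).group(1))

    usages_text = re.search(r"usages \[(.*)\]", msg).group(1)
    return {
        "would_be": grab(r"would be \[(\d+)/"),
        "limit": grab(r"limit of \[(\d+)/"),
        "real_usage": grab(r"real usage: \[(\d+)/"),
        "new_bytes": grab(r"new bytes reserved: \[(\d+)/"),
        "usages": {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)/", usages_text)},
    }


info = parse_breaker_message(MESSAGE)
tracked = sum(info["usages"].values())

print("real usage + new bytes:", info["real_usage"] + info["new_bytes"])
print("over the limit by:", info["would_be"] - info["limit"])
print("tracked by child breakers:", tracked)
```

The child breakers account for only about 11 MB, and the bulk request itself reserves 3.5kb. Since `indices.breaker.total.use_real_memory` is `true`, the parent breaker compares the actual JVM heap usage with the limit, so almost all of the 1.9gb is other heap usage at that moment (possibly including garbage that had not been collected yet) rather than the request that was rejected.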