
ELK Basics


Setup repository (a Docker Compose distribution of the Elastic Stack):

https://github.com/deviantony/docker-elk
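
To bring the stack up locally, the repository is typically started with Docker Compose. The exact service names depend on the repository version, so treat the following as a sketch rather than the canonical procedure:

git clone https://github.com/deviantony/docker-elk.git
cd docker-elk
docker compose up setup      # one-time initialization of built-in users (recent repo versions)
docker compose up -d         # start Elasticsearch, Logstash, and Kibana in the background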


The Elastic Stack, also known as the ELK Stack, is a collection of open-source tools for searching, analyzing, and visualizing log data in real time. The stack includes Elasticsearch, Logstash, Kibana, and Beats. Each component serves a specific purpose within the data processing and analysis pipeline.


Beats

Beats are lightweight data shippers for various data sources. They are designed to collect and send data to Logstash or Elasticsearch. There are different types of Beats, each tailored for specific use cases:

  • Filebeat: Monitors log files and forwards log data.
  • Metricbeat: Collects system and service metrics.
  • Packetbeat: Captures network packets and provides network traffic data.
  • Winlogbeat: Collects Windows Event logs.
  • Auditbeat: Collects audit data from the Linux Audit Framework.
  • Heartbeat: Monitors the availability of services.
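
As a concrete example, a minimal Metricbeat configuration (metricbeat.yml) collecting system metrics might look like the sketch below; the module, metricsets, period, and output host are illustrative defaults, not a recommended production setup:

metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "filesystem"]   # which system metrics to collect
    period: 10s                                   # collection interval

output.elasticsearch:
  hosts: ["localhost:9200"]                       # ship metrics directly to Elasticsearch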

Logstash

Logstash is a data processing pipeline that ingests, transforms, and ships data. It works well with Beats, as it can process and enhance the data before sending it to Elasticsearch or other destinations. Logstash uses a configuration file that defines:

  • Inputs: Sources of data (e.g., Beats, log files, databases).
  • Filters: Processing steps to parse, enrich, and transform data (e.g., grok, mutate, date).
  • Outputs: Destinations for the processed data (e.g., Elasticsearch, files, message queues).
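
Putting the three stages together, a minimal pipeline that receives events from Beats, adds a field, and indexes the result might look like this sketch (the port, field, and index names are illustrative):

input {
  beats {
    port => 5044                                 # port that Filebeat ships events to
  }
}

filter {
  mutate {
    add_field => { "environment" => "demo" }     # example enrichment step
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "beats-demo-%{+YYYY.MM.dd}"         # daily index; name is illustrative
  }
}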

Elasticsearch

Elasticsearch is a distributed search and analytics engine. It is designed for scalability, high availability, and real-time search capabilities. Elasticsearch stores the processed data from Logstash and provides powerful search and aggregation capabilities. Key features include:

  • Full-text search: Supports complex queries and full-text search.
  • Real-time indexing: Allows near real-time search and data analysis.
  • Distributed architecture: Scales horizontally with clustering.
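
As a sketch of how these capabilities are exercised, a full-text query combined with a terms aggregation can be sent to the _search REST API. The index pattern and field names below are illustrative and assume the default keyword sub-field mapping:

curl -s -u elastic:changeme -X GET "localhost:9200/syslog-*/_search?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "message": "error" } },
  "aggs": { "hosts": { "terms": { "field": "host.keyword" } } },
  "size": 5
}'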

Kibana

Kibana is a visualization tool for Elasticsearch. It provides a web-based interface to explore, visualize, and analyze data stored in Elasticsearch. Key features include:

  • Dashboards: Create and share custom dashboards with visualizations.
  • Visualization: Offers a variety of visualization types (e.g., bar charts, pie charts, maps).
  • Discover: Allows users to explore and search raw data.
  • Timelion: Time-series data analysis and visualization.
  • Machine Learning: Anomaly detection and predictive analytics on time-series data.

How They Work Together

  1. Data Collection: Beats agents are deployed on servers to collect logs, metrics, or network data.
  2. Data Ingestion: Beats send the collected data to Logstash or directly to Elasticsearch.
  3. Data Processing: Logstash processes and enriches the data using its pipeline of inputs, filters, and outputs.
  4. Data Storage: Processed data is indexed and stored in Elasticsearch.
  5. Data Visualization: Kibana provides a user interface to visualize and explore the data in Elasticsearch.

Example Workflow

  1. Filebeat reads log files from an application server and forwards the logs to Logstash.
  2. Logstash processes the logs, parsing the data and enriching it with additional context (e.g., adding geo-location based on IP addresses).
  3. The processed data is then sent to Elasticsearch for indexing and storage.
  4. Kibana is used to create dashboards that visualize the log data, enabling users to monitor application performance and detect issues in real time.
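
The geo-location enrichment mentioned in step 2 is typically handled by Logstash's geoip filter. A minimal sketch, assuming the client IP has already been parsed into a clientip field:

filter {
  geoip {
    source => "clientip"       # field containing the IP address to look up
    target => "geo"            # store city, country, and coordinates under "geo"
  }
}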

Filebeat configuration

The annotated filebeat.yml below covers inputs, modules, template settings, the Kibana endpoint, outputs, and processors:

# ============================ Filebeat inputs =============================

filebeat.inputs:

# ------------------------------ Log input ---------------------------------
- type: log
  enabled: true
  paths:
    - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

# ================= Filebeat modules =================

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

# ================= Elasticsearch template setting =================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

# ================= Kibana =================

setup.kibana:
  host: "localhost:5601"

# ================= Elastic Cloud =================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id: "..."

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth: "elastic:changeme"

# ================= Outputs =================

# ----------------- Elasticsearch Output -----------------
output.elasticsearch:
  hosts: ["localhost:9200"]
  username: "elastic"
  password: "changeme"

# ----------------- Logstash Output -----------------
#output.logstash:
#  hosts: ["localhost:5044"]

# ================= Processors =================

# Configure processors to enhance or manipulate the data before sending it to the output
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

# ================= X-Pack Monitoring =================

# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch.
# This is a very useful feature to track and monitor the health of the
# Filebeat instance.
#xpack.monitoring.enabled: false
#xpack.monitoring.elasticsearch:

# ================= Migration =================

# This allows enabling 6.7 migration aliases
#migration.6_to_7.enabled: true

Logstash Grok parsing example

# Example log
Jun 25 12:05:15 myhost myprogram[12345]: This is a test log message

# Parse with Grok
%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:host} %{WORD:program}\[%{NUMBER:pid}\]: %{GREEDYDATA:message}

# Configuration in Logstash
input {
  file {
    path => "/var/log/syslog"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:host} %{WORD:program}\[%{NUMBER:pid}\]: %{GREEDYDATA:message}" }
    # Overwrite the original message field instead of keeping both the raw and parsed values
    overwrite => [ "message" ]
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "syslog-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
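
Run against the sample log line above, the grok pattern yields fields along these lines in the rubydebug output (additional metadata such as @timestamp and path will also appear):

{
     "timestamp" => "Jun 25 12:05:15",
          "host" => "myhost",
       "program" => "myprogram",
           "pid" => "12345",
       "message" => "This is a test log message"
}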
