Top Prometheus Interview Questions and Answers (2024) | TechGeekNext


  1. What is Prometheus?
  2. What is the Architecture of Prometheus Monitoring?
  3. How do we manage and monitor a Spring Boot application in production using Prometheus?
  4. What are the Features of Prometheus?
  5. What are the Components of Prometheus?
  6. What database is used by Prometheus?
  7. What is PromQL?
  8. What are the different PromQL data types available in Prometheus Expression language?
  9. How to calculate the average request duration over the last 5 minutes from a histogram or summary?
  10. What are Gauges in Prometheus?
  11. What are Counters in Prometheus?
  12. What do you mean by Summaries and Histograms in Prometheus?
  13. What is the default data retention period in Prometheus?
  14. How to persist data between restarts when Prometheus runs in a Docker container?
  15. How do I check my Prometheus status?
  16. How do I trigger a Prometheus alert?
  17. What is Prometheus exporter?
  18. What is Thanos Prometheus?
  19. How do I delete old Prometheus data earlier than default retention time?
  20. How do you increase the time retention period in Prometheus?
  21. How do you increase the size retention policy?

Q: What is Prometheus?
Ans:

Prometheus is a free, open-source monitoring and alerting system. It records real-time metrics in a time series database, collects them over an HTTP pull model, and supports flexible queries and real-time alerting.

Q: What is the Architecture of Prometheus Monitoring?
Ans:

Prometheus is written in Go and can run as a standalone binary or in a Docker container. The monitoring application consists of a time-series database, a web user interface, and PromQL, a flexible query language.

(Figure: Prometheus monitoring architecture.)

Prometheus scrapes metrics from instrumented jobs and stores the samples locally. It then runs rules over this data, either to record new time series derived from existing data or to generate alerts. Metrics are exposed as counters, gauges, histograms, and summaries, and are transmitted as plain text over HTTP.

Q: How do we manage and monitor a Spring Boot application in production using Prometheus?
Ans:

Spring Boot Actuator is a Spring Boot sub-project that adds monitoring and management capabilities to production-ready applications. It offers a number of HTTP and JMX endpoints that you can interact with.

Spring Boot Actuator provides a Prometheus endpoint. Prometheus regularly scrapes this endpoint for metric data and offers a graphical representation of it. We can see API latency, performance, and so on in the Prometheus graph view.
(Figure: Prometheus graph of API latency for the testApi endpoint.)
Refer to the Spring Boot Actuator + Prometheus + Grafana example.

Q: What are the Features of Prometheus?
Ans:

The following are some of the most important features of Prometheus:

  • Multiple modes of graphing and dashboarding are supported.
  • Time series are collected via a pull model over HTTP.
  • PromQL, a flexible query language, takes advantage of the multi-dimensional data model.
  • Individual server nodes are autonomous and do not rely on distributed storage.
  • Time series data is identified by a metric name and a set of key-value pairs (labels).

Q: What are the Components of Prometheus?
Ans:

Most Prometheus components are written in Go, which makes them easy to build and deploy as static binaries. Many of the components are optional.

  • Prometheus Server
    The Prometheus server scrapes and stores metrics. It uses its own persistence layer, which is built into the server rather than being a separate component. Each server node is self-contained and does not rely on distributed storage.
  • Prometheus UI
    The web UI lets us view charts and graphs and access stored data. Prometheus ships with a simple built-in UI. We can also configure other visualisation tools, such as Grafana, to query the Prometheus server using the Prometheus Query Language (PromQL).
  • Prometheus Alertmanager
    Alertmanager handles alerts sent by client applications such as the Prometheus server. It has advanced features for grouping, deduplicating, and routing alerts, and it can route them to services like OpsGenie and PagerDuty.

Q: What database is used by Prometheus?
Ans:

Prometheus uses its own local on-disk time series database (TSDB) and can also integrate with remote storage systems.

Q: What is PromQL?
Ans:

Prometheus provides its own query language, the Prometheus Query Language (PromQL). It allows users to select and aggregate time series data. PromQL is designed specifically for querying the time series database. Prometheus offers four metric types, listed below:

  1. Prometheus Gauge
  2. Prometheus Counter
  3. Prometheus Summary
  4. Prometheus Histogram

Q: What are the different PromQL data types available in Prometheus Expression language?
Ans:

An expression or sub-expression in Prometheus expression language can evaluate to one of four types:

  1. Instant vector

    A set of time series containing a single sample for each time series, all sharing the same timestamp.

    This example selects all time series with the metric name http_requests_total:

    http_requests_total

    We can filter the data further by appending a comma-separated list of label matchers in curly braces ({}), using the matching operators = (equal), != (not equal), =~ (regex match), and !~ (no regex match).

    # Selects the http_requests_total time series that have the job label set to prometheus
    # and the group label set to canary.
    http_requests_total{job="prometheus",group="canary"}
    
    # Selects all http_requests_total time series for the staging, testing,
    # and development environments and HTTP methods other than GET.
    http_requests_total{environment=~"staging|testing|development",method!="GET"}
  2. Range vector

    A set of time series containing a range of data points over time for each time series.

    Time durations can be specified in ms, s, m, h, d (a day is always 24h), w (a week is always 7d), and y (a year is always 365d).

    # Selects all values recorded within the last 5 minutes for all time series
    # with the metric name http_requests_total and a job label of prometheus.
    http_requests_total{job="prometheus"}[5m]
  3. Scalar
    A simple numeric floating point value.
  4. String

    A simple string value, currently unused.

    String literals can be written in double quotes, single quotes, or backticks. No escaping is processed inside backticks. Unlike Go, Prometheus does not discard newlines within backticks.

    "Example of string"
    'Example of unescaped: \n \\ \t'
    `Example of not unescaped: \n ' " \t`
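
The label matchers described under instant vectors can be illustrated outside Prometheus. The following is a minimal Python sketch, not Prometheus code: the `matches` helper and the sample series are hypothetical, and Python's `re` module stands in for the RE2 syntax that Prometheus uses. It assumes regex matchers are fully anchored and that a missing label behaves like an empty value.

```python
import re


def matches(labels, matchers):
    """Return True if a series' labels satisfy every (name, op, value) matcher."""
    for name, op, value in matchers:
        actual = labels.get(name, "")  # assumption: a missing label acts like ""
        if op == "=":
            ok = actual == value
        elif op == "!=":
            ok = actual != value
        elif op == "=~":
            ok = re.fullmatch(value, actual) is not None  # fully anchored match
        elif op == "!~":
            ok = re.fullmatch(value, actual) is None
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


# Hypothetical series, used only for illustration.
series = [
    {"__name__": "http_requests_total", "environment": "staging", "method": "POST"},
    {"__name__": "http_requests_total", "environment": "staging", "method": "GET"},
    {"__name__": "http_requests_total", "environment": "production", "method": "POST"},
]

# Equivalent of {environment=~"staging|testing|development",method!="GET"}
matchers = [
    ("environment", "=~", "staging|testing|development"),
    ("method", "!=", "GET"),
]
selected = [s for s in series if matches(s, matchers)]
```

Only the staging POST series is selected here: the GET series fails the `!=` matcher and the production series fails the regex matcher.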

Q: How to calculate the average request duration over the last 5 minutes from a histogram or summary?
Ans:

To calculate the average request duration over the last 5 minutes from a histogram or summary called http_request_duration_seconds, use the following expression:

rate(http_request_duration_seconds_sum[5m])
/
 rate(http_request_duration_seconds_count[5m])
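
To see why dividing the two rates yields an average, here is a simplified Python sketch of the arithmetic. It is an illustration only: the sample values are made up, the `average_duration` helper is hypothetical, and it ignores the counter-reset handling and extrapolation that the real `rate()` function performs. Because both rates cover the same window, the window length cancels out and the result is the increase in `_sum` divided by the increase in `_count`.

```python
def average_duration(sum_samples, count_samples):
    """Average observed value over a window, from _sum and _count samples.

    Both lists hold the raw counter values scraped during the same window,
    oldest first. Counter resets are not handled in this sketch.
    """
    sum_increase = sum_samples[-1] - sum_samples[0]
    count_increase = count_samples[-1] - count_samples[0]
    if count_increase == 0:
        return float("nan")  # no requests in the window
    return sum_increase / count_increase


# Hypothetical scrapes of http_request_duration_seconds_sum and _count.
sum_samples = [120.0, 123.5, 130.0]   # total seconds spent serving requests
count_samples = [1000, 1010, 1040]    # total number of requests served

avg = average_duration(sum_samples, count_samples)  # 10.0 s / 40 requests
```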

Q: What are Gauges in Prometheus?
Ans:

A gauge is a metric that represents a single numerical value that can arbitrarily go up and down. Gauges are typically used for measured values such as temperature or current memory usage.

Q: What are Counters in Prometheus?
Ans:

A counter is a cumulative metric that represents a single monotonically increasing value; it can only increase or be reset to zero on restart. For example, we can use a counter to represent the number of requests served, tasks completed, or errors.
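
Because a counter can drop back to zero on restart, anything that computes its growth has to account for resets. The sketch below shows one simple way to do that in Python. It is an assumption-laden illustration, not the Prometheus implementation: the `increase` helper and sample values are hypothetical, and the real `increase()` function also extrapolates to the window boundaries.

```python
def increase(samples):
    """Total growth of a counter across samples, tolerating resets to zero."""
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        if cur < prev:
            # The value went down, so the process restarted and counting
            # began again from zero; everything in `cur` is new growth.
            total += cur
        else:
            total += cur - prev
        prev = cur
    return total


# Hypothetical scrapes of a request counter; the drop to 5 marks a restart.
samples = [100, 130, 150, 5, 25]
total = increase(samples)  # 30 + 20 + 5 + 20
```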

Q: What do you mean by Summaries and Histograms in Prometheus?
Ans:

Prometheus supports two kinds of complex metrics: Summaries and Histograms.

Both track the number of observations and the sum of the observed values, and each generates several time series in the database. For example, both expose the sum of observed values in a series with the _sum suffix and the number of observations in a series with the _count suffix.

Histogram
A histogram samples observations (typically request durations or response sizes) and counts them in configurable buckets. It also provides the sum of all observed values.

This makes a histogram a good choice for tracking things like latency, which may have an SLO (Service Level Objective) defined against it.

Summary
A summary also samples observations (usually request durations or response sizes). It provides the total number of observations and the sum of all observed values, and it can calculate configurable quantiles over a sliding time window.
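
The bucket bookkeeping behind a histogram can be sketched in a few lines of Python. This is a hypothetical, simplified model (the `Histogram` class and the bucket bounds are invented for illustration and are not the client library API). It assumes buckets are cumulative, each one counting every observation less than or equal to its upper bound, with a final catch-all bucket at infinity.

```python
import bisect


class Histogram:
    """Toy histogram with cumulative 'less than or equal' buckets."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0    # corresponds to the _sum series
        self.count = 0    # corresponds to the _count series

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        index = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.sum += value
        self.count += 1

    def cumulative_buckets(self):
        """Return (upper_bound, cumulative_count) pairs, smallest bound first."""
        running = 0
        result = []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            running += n
            result.append((bound, running))
        return result


# Hypothetical request durations in seconds.
h = Histogram([0.1, 0.5, 1.0])
for duration in [0.05, 0.3, 0.5, 0.7, 2.0]:
    h.observe(duration)

buckets = h.cumulative_buckets()
```

With these observations, four of the five requests fall at or below the 1.0 second bound, which is the kind of ratio an SLO on latency would be checked against.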

Q: What is the default data retention period in Prometheus?
Ans:

The default data retention period in Prometheus is 15 days. Data older than the retention period is deleted automatically.

Q: How to persist data between restarts when Prometheus runs in a Docker container?
Ans:

We can create and mount a volume so that Prometheus data persists across container restarts, using the commands below:

$ docker volume create a-new-volume
$ docker run \
    --publish 9090:9090 \
    --volume a-new-volume:/prometheus \
    --volume "$(pwd)"/prometheus.yml:/etc/prometheus/prometheus.yml \
    prom/prometheus

Q: How do I check my Prometheus status?
Ans:

Open your browser and go to http://localhost:9090 to test the Prometheus server installation. You should be able to see the Prometheus interface. Select Status, then Targets. Your machines should be listed as UP under State.

Q: How do I trigger a Prometheus alert?
Ans:

AlertManager is a single binary that receives alerts from the Prometheus server and sends them to the end user through email, Slack, or other means.

The following are the steps for setting up Prometheus alerts:

  • Install and configure AlertManager.
  • Update the Prometheus config file so that Prometheus can communicate with AlertManager.
  • Define alert rules in the Prometheus server configuration.
  • In AlertManager, configure receivers to send alerts via Slack and email.

Q: What is Prometheus exporter?
Ans:

A Prometheus exporter is a piece of software that fetches statistics from another, non-Prometheus system and converts them into Prometheus metrics, typically using a client library. It runs a web server that exposes a /metrics URL, and requesting that URL displays the system's metrics.
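
As a rough illustration of the idea, the sketch below implements a tiny exporter using only the Python standard library instead of a client library. Everything specific in it is an assumption made for the example: the `read_stats` function stands in for querying the external system, and the `demo` metric prefix and port 9200 are arbitrary. A real exporter would normally use an official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_stats():
    """Stand-in for querying the non-Prometheus system (hypothetical values)."""
    return {"queue_depth": 7, "workers_busy": 3}


def render_metrics(stats, prefix="demo"):
    """Convert a dict of statistics into the plain-text exposition format."""
    lines = []
    for name, value in sorted(stats.items()):
        metric = f"{prefix}_{name}"
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(read_stats()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Start the exporter; the port number is an arbitrary choice."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and opening http://localhost:9200/metrics in a browser would show the rendered metrics, and Prometheus could then be pointed at that address as a scrape target.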

Q: What is Thanos Prometheus?
Ans:

Put simply, Thanos is a "highly available Prometheus setup with long-term storage capability." Thanos enables you to query and aggregate data from several Prometheus instances through a single endpoint. It also automatically deduplicates measurements that are reported by more than one Prometheus instance.

Q: How do I delete old Prometheus data earlier than default retention time?
Ans:

You can delete specific data before the retention period expires.

  • Before you can do so, you must first enable the admin API in Prometheus.
    sudo nano /etc/default/prometheus
  • Add --web.enable-admin-api to the ARGS="" variable.
    ARGS="--web.enable-admin-api"
  • Restart Prometheus and check status.
    sudo service prometheus restart
    sudo service prometheus status
  • You can now make calls to the admin API.
    Example: delete all time series for instance="sbcode.net:9100"
    curl -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={instance="sbcode.net:9100"}'
NOTE: Once the data is deleted, disable the admin API again with the following steps:
  1. sudo nano /etc/default/prometheus
  2. Remove --web.enable-admin-api from the ARGS variable.
    ARGS=""
    
  3. Restart Prometheus and check status
    sudo service prometheus restart
    sudo service prometheus status
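
The delete call from the example above can also be prepared from a script. The following Python sketch only builds the request and does not send it; the `build_delete_request` helper is hypothetical, and the base URL and selector are taken from the curl example. It percent-encodes the `match[]` selector, which is what the `-g` flag and the quoting take care of in the curl command.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_delete_request(base_url, selector):
    """Build (but do not send) a POST request to the delete_series endpoint."""
    query = urlencode({"match[]": selector})  # percent-encodes [], {}, = and "
    url = f"{base_url}/api/v1/admin/tsdb/delete_series?{query}"
    return Request(url, method="POST")


req = build_delete_request("http://localhost:9090", '{instance="sbcode.net:9100"}')
```

Passing `req` to `urllib.request.urlopen` would send it, which only succeeds while the admin API is enabled.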
    

Q: How do you increase the time retention period in Prometheus?
Ans:

Prometheus metrics are stored for 15 days by default. For troubleshooting purposes, this retention duration may be insufficient.

You can increase the retention period with the --storage.tsdb.retention.time startup flag, as shown below:

  • Open the /etc/sysconfig/prometheus file on the management node, set the STORAGE_RETENTION option to the required retention duration, and save your changes.
    For example:
    STORAGE_RETENTION="--storage.tsdb.retention.time=30d"
  • Restart the Prometheus service:
    systemctl restart prometheus.service

Q: How do you increase the size retention policy?
Ans:

If the data is kept for a long time, the root partition where it is stored may run out of space. You can prevent this by setting a maximum size for the stored Prometheus metrics.

  • Open the /etc/sysconfig/prometheus file on the management node, set the STORAGE_RETENTION option to the required maximum size, and save your changes.
    For example:
    STORAGE_RETENTION="--storage.tsdb.retention.size=10GB"
  • Restart the Prometheus service:
    systemctl restart prometheus.service
