org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: XXX

The other day my team ran into the issue below, and I did some debugging to help them resolve it. The error was straightforward: an authorization problem, i.e. the ACLs (Access Control Lists) were not set up correctly.

[shri@xxxxx config]# /opt/kafka_2.11-0.10.1.1/bin/kafka-console-consumer.sh --bootstrap-server xxxxx:9093 --topic topic1 --consumer.config /opt/kafka_2.11-0.10.1.1/config/consumer-ssl.properties --from-beginning
[2017-06-08 23:06:15,290] WARN Error while fetching metadata with correlation id 1 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-06-08 23:06:15,292] ERROR Unknown error when running consumer: (kafka.tools.ConsoleConsumer$)
org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: group1

To debug the issue:

1. List the ACLs for the topic (topic1) and the consumer group (group1).

Current ACLs for resource `Topic:topic1`:
User:CN=consumer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Describe from hosts: *
User:CN=producer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Describe from hosts: *
User:CN=consumer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Read from hosts: *
User:CN=producer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Write from hosts: *

Current ACLs for resource `Group:group1`:
User:CN=consumer.shri.com,OU=IT,O=XXX,L=Dallas,ST=TX,C=US has Allow permission for operations: Read from hosts: *

2. List the certificate (keystore) being used by the consumer. As the keytool output below shows, there is a minor difference between the principal in the ACLs for the topic and group and the subject of the cert being used by the consumer: the ACL had the location (L) as Austin, while the cert had it as Atlanta. We changed the cert to match what was in the ACL, and that resolved the error.

The JKS subject is
[spatel@xxxxx config]# keytool -v -list -keystore test.jks
Certificate[1]:
Owner: CN=consumer.shri.com, OU=IT, O=XXX Inc., L=Atlanta, ST=TX, C=US

So it's a simple setup issue: make sure the configuration and setup on all sides line up correctly.
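A mismatch like this is easy to miss by eye, so here is a minimal Java sketch of the comparison. It assumes the broker derives the principal from the certificate's subject DN (which is what the `User:CN=...` entries in the ACL listing suggest) and that the match is an exact string match. The class name `PrincipalCheck` is hypothetical, and the two DNs are the ones from this post.

```java
import javax.security.auth.x500.X500Principal;

final class PrincipalCheck {

    /** Normalises a DN (for example the "Owner:" line from keytool) to RFC 2253 form. */
    static String normalise(String dn) {
        return new X500Principal(dn).getName(X500Principal.RFC2253);
    }

    /** True only when both DNs are the same string after normalisation. */
    static boolean matches(String aclPrincipal, String certOwner) {
        return normalise(aclPrincipal).equals(normalise(certOwner));
    }

    public static void main(String[] args) {
        // Principal from the topic ACL (without the "User:" prefix).
        String acl = "CN=consumer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US";
        // Owner line printed by keytool for the consumer keystore.
        String cert = "CN=consumer.shri.com, OU=IT, O=XXX Inc., L=Atlanta, ST=TX, C=US";

        System.out.println("ACL  : " + normalise(acl));
        System.out.println("Cert : " + normalise(cert));
        System.out.println("Match: " + matches(acl, cert));
    }
}
```

Printing both normalised strings one above the other makes a single differing attribute such as L stand out.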

SQL optimization using LISTAGG – Avoid repeated self joins on a child table.

I have a scenario with an OBJECTS table and its child table ATTRIBUTES. Each row in OBJECTS can have a different set of ATTRIBUTES (i.e. a different count and different types). To make it easy to understand, I have kept the tables simple.

CREATE TABLE objects
(
id INT,
name VARCHAR2(256),
PRIMARY KEY (id)
);

CREATE TABLE attributes
(
id INT,
object_id INT,
name VARCHAR2(256),
value VARCHAR2(256),
PRIMARY KEY (id),
CONSTRAINT fk_order FOREIGN KEY (object_id) REFERENCES objects(id)
);

INSERT INTO objects VALUES (1, 'toy');
INSERT INTO attributes VALUES (1, 1, 'color', 'blue');
INSERT INTO attributes VALUES (2, 1, 'weight', '2lb');
INSERT INTO attributes VALUES (3, 1, 'price', '$25');

-- food has no color attribute
INSERT INTO objects VALUES (2, 'food');
INSERT INTO attributes VALUES (4, 2, 'weight', '2lb');
INSERT INTO attributes VALUES (5, 2, 'price', '$25');

Now we want to list each object with its attributes as columns, like this:

ID NAME COLOR WEIGHT PRICE
1 toy blue 2lb $25

SELECT
    o.id id,
    o.name name,
    a1.value color,
    a2.value weight,
    a3.value price
FROM
    objects o,
    attributes a1,
    attributes a2,
    attributes a3
WHERE
    (o.id = a1.object_id AND a1.name = 'color')
    AND (o.id = a2.object_id AND a2.name = 'weight')
    AND (o.id = a3.object_id AND a3.name = 'price')

This worked and gave us the required output. When I looked at it in code review, though, I had some concerns with the approach. The ATTRIBUTES table is self joined 3 times, and if a developer needed more attributes, they would have to join ATTRIBUTES again once per extra attribute. In production the ATTRIBUTES table would have thousands of rows, so this would turn into a performance issue.

If we had this data in de-normalized form we would not have run into this issue. De-normalizing 2 tables is a minor change, but in the actual scenario the tables were connected to a lot of other data, and the de-normalization would have led to cascading effort. So I approached our DBA to pick his brains on this. He suggested using LISTAGG, and with his help the optimized query looked like the one below.

SELECT object_attr.id,
       object_attr.name,
       -- values are concatenated in attribute-name order: color, price, weight
       LISTAGG(object_attr.value, '|----|') WITHIN GROUP (ORDER BY object_attr.attr_name) attribute
FROM (SELECT o.id, o.name, a.name attr_name, a.value
      FROM objects o,
           attributes a
      WHERE a.name IN ('color', 'weight', 'price')
      AND o.id = a.object_id
     ) object_attr
GROUP BY object_attr.id, object_attr.name
ORDER BY object_attr.id

ID NAME ATTRIBUTE
1 toy blue|----|$25|----|2lb

The explain plan for the un-optimized self-join query was

Operation         Node Cost  Cost  CPU Cost     I/O Cost  Optimizer  Cardinality  Bytes  Position
SELECT STATEMENT  0.0 %      1160  107,161,454  1150      ALL_ROWS   39           34671  1160

(Partition Start, Partition Stop and Partition Id were empty.)

The explain plan for the optimized query was

Operation         Node Cost  Cost  CPU Cost    I/O Cost  Optimizer  Cardinality  Bytes  Position
SELECT STATEMENT  0.0 %      391   47,603,589  387       ALL_ROWS   193          61374  391

(Partition Start, Partition Stop and Partition Id were empty.)

As always, with time we updated the query to get more attributes from the ATTRIBUTES table. The un-optimized query's CPU and I/O cost would have kept climbing with each new attribute, since every attribute adds another self join, whereas the cost of the optimized query stayed pretty much constant. We had to do a little more work in Java to parse the attribute field.
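The Java-side parsing mentioned above could look roughly like the sketch below. It is not our production code: the class name `AttributeParser` is hypothetical, and it assumes the delimiter is the literal string |----| and that every object has all of the requested attributes. Since LISTAGG orders the values by attribute name, the names have to be passed in that same sorted order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class AttributeParser {

    // Delimiter used in the LISTAGG call; it must never occur inside an attribute value.
    static final String DELIMITER = "|----|";

    /**
     * Splits the aggregated column into name -> value pairs.
     * sortedNames must be in the same order as the LISTAGG ORDER BY (attribute name).
     */
    static Map<String, String> parse(String aggregated, String... sortedNames) {
        // Pattern.quote stops '|' from being treated as regex alternation.
        String[] values = aggregated.split(Pattern.quote(DELIMITER), -1);
        if (values.length != sortedNames.length) {
            // Positional matching is only safe when no attribute is missing.
            throw new IllegalArgumentException(
                    "expected " + sortedNames.length + " values but got " + values.length);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            result.put(sortedNames[i], values[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("blue|----|$25|----|2lb", "color", "price", "weight"));
    }
}
```

If some objects can lack an attribute, positional matching breaks down. One option (also an assumption, not what we did) is to aggregate name=value pairs instead of bare values so that each piece identifies itself.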

Debug SSL issue – part 2 (2 way SSL)

I covered debugging 1 way SSL in Debug SSL issue – part 1 (1 way SSL).

The only difference with 2 way SSL is the additional step to verify the client certificate.

  • As part of *** ServerHello, TLSv1.2, the server challenges the client to provide its certificate as well. You will see the lines below at the end of the server hello

*** CertificateRequest
Cert Types: RSA, DSS
Cert Authorities:
<CN=XXXXXXXXXXxXXXXX>
<OU=XXXXXXXXXXXXXXX>
<CN=XXXXXXXXXXXXXXXx>

  • Basically the server is asking the client to provide a certificate that is signed by one of the certificate authorities (CAs) in the list. The server only trusts these CAs.
  • The client looks in its keystore \ identity store for a cert that matches the list above. If it finds one, it sends that cert to the server, and the server validates that cert or cert chain against its trust store.
  • When the client finds a matching cert, its debug log prints

matching alias: XXXXXX
*** Certificate chain
chain [0] = […………

  • The next steps are similar to 1 way SSL.
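For context, the CertificateRequest message is only sent when the server side asks for client authentication. In plain JSSE that is the setNeedClientAuth(true) switch, shown in the minimal sketch below. The class name `TwoWaySslServer` is hypothetical, and the context is initialised with the JRE defaults only to keep the example self-contained. A real server would pass its own key and trust managers.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

final class TwoWaySslServer {

    /** Builds a server-side engine; requireClientCert = true makes the handshake 2 way. */
    static SSLEngine serverEngine(boolean requireClientCert) {
        try {
            SSLContext context = SSLContext.getInstance("TLSv1.2");
            // nulls = default key managers, trust managers and random source (sketch only).
            context.init(null, null, null);
            SSLEngine engine = context.createSSLEngine();
            engine.setUseClientMode(false);               // act as the server
            engine.setNeedClientAuth(requireClientCert);  // triggers *** CertificateRequest
            return engine;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not set up TLS", e);
        }
    }

    public static void main(String[] args) {
        SSLEngine engine = serverEngine(true);
        System.out.println("Client cert required: " + engine.getNeedClientAuth());
    }
}
```

There is also setWantClientAuth(true), which requests a client certificate but lets the handshake continue if the client has none.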


Debug SSL issue – part 1 (1 way SSL)

In Java, to debug an SSL issue, add -Djavax.net.debug=ssl to the java command line arguments. This can be added on either the server side or the client side, and it prints the SSL handshake details between client and server to standard out.

I am assuming this is a 1 way SSL connection from client to server. For details on 1 way SSL, see https://blogshri.wordpress.com/2014/08/24/ssl-part-3-https-communication-type/

Caution

  1. SSL debug is very verbose and prints a lot to the log, so if you are debugging on the server side, it may be good to limit traffic to the server to the client you are facing the issue with.
  2. The exact output you see may vary based on the version of the Java Runtime (JRE), so I will try to keep this discussion generic.
  3. In an enterprise environment there are a lot of components between client and server (proxy, firewall, load balancer etc.), so first verify that the request from the client is actually making it to the server. Look at SSL debug only once you have established that the client is reaching the server and that the failure is in the SSL handshake between them.

Once you have this added, here is how to read the details that are printed.

  • The first thing it will print is *** ClientHello, TLSv1.2. This indicates the client request is making it to the server and that the connection protocol is TLSv1.2. After this line it prints a bunch of details, like the cipher suites and extensions that the client supports.
  • The server reciprocates with *** ServerHello, TLSv1.2. This indicates the server has received the request and which protocol it will use. After that it indicates the cipher suite it has selected. If there are no cipher suites in common, you will see the error – no cipher suites in common.
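When chasing a no cipher suites in common error, it can help to print what the local JRE enables by default and compare that with the list in the ClientHello. The sketch below does that with the standard JSSE API. The class name `TlsCapabilities` is hypothetical, and an application that configures its own SSLContext may enable a different set.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

final class TlsCapabilities {

    /** Default TLS settings of this JRE (not of any particular application). */
    private static SSLParameters defaults() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    static List<String> enabledProtocols() {
        return Arrays.asList(defaults().getProtocols());
    }

    static List<String> enabledCipherSuites() {
        return Arrays.asList(defaults().getCipherSuites());
    }

    public static void main(String[] args) {
        System.out.println("Protocols: " + enabledProtocols());
        for (String suite : enabledCipherSuites()) {
            System.out.println("Cipher suite: " + suite);
        }
    }
}
```

Running this with the JRE of the client and again with the JRE of the server shows whether the two lists overlap at all.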

Then, as part of the server hello, the server presents its certificate

RandomCookie: GMT: 1495682071 bytes = { 188, 152, 123, 228, 111, 6, 216, 173, 128, 2,
chain [0] = [
[
Version: V3
Subject: XXXXXXXXXXXXXXXXXXXXXXXXXX
Signature Algorithm: SHA1withRSA, OID = 1.2.840.113549.1.1.5

Key: Sun RSA public key, 2048 bits
modulus: XXXXXXXXXXXXXXXXXXXxxxx
public exponent: 65537
Validity: [From: Thu May 12 14:47:31 CDT 2016,
To: Wed May 13 14:47:31 CDT 2026]
Issuer: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SerialNumber: [ c8]

  • And prints *** ServerHelloDone.
  • Now the client has to validate this certificate against the certs present in its trust store. So on the client side you should see something like

***
Found trusted certificate:
[
[

If you don't have the right certificate or CA in the trust store on the client side, or if the trust store is not correctly configured, you will see an error at this point in the log like – unable to find valid certification path

  • If the certificate is found in the client trust store, the client and server establish the symmetric key (how depends on the cipher suite used) that will be used for any further communication.
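For the unable to find valid certification path error, a quick check is to list which CAs the client JVM actually trusts. The sketch below uses the default trust store, meaning whatever javax.net.ssl.trustStore points to or otherwise the JRE's cacerts. The class name `TrustStoreLister` is hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

final class TrustStoreLister {

    /** Subject DNs of every certificate in the JVM's default trust store. */
    static List<String> trustedSubjects() {
        try {
            TrustManagerFactory factory =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            factory.init((KeyStore) null); // null = use the default trust store
            List<String> subjects = new ArrayList<>();
            for (TrustManager manager : factory.getTrustManagers()) {
                if (manager instanceof X509TrustManager) {
                    for (X509Certificate ca : ((X509TrustManager) manager).getAcceptedIssuers()) {
                        subjects.add(ca.getSubjectX500Principal().getName());
                    }
                }
            }
            return subjects;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not load the default trust store", e);
        }
    }

    public static void main(String[] args) {
        trustedSubjects().forEach(System.out::println);
    }
}
```

If the Issuer printed in the server's certificate chain does not show up in this list, the client has no way to build a valid certification path.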


Java One Session Notes

I recently attended JavaOne (2016). Here are my notes from it.

  • Java Technology and Ecosystem: Everything You Wanted to Know
    • OpenJDK and Oracle JDK should be usable interchangeably.
    • Prefer binary message format over json\xml
  • Microservices
    • Network latency is never zero.
    • Containerize microservices as much as possible.
    • http://jav.mn/dockerjee
    • Container – standardize interface between dev and ops.
    • https://github.com/eldermoraes/javaoneus2016
    • Jelastic Docker – can migrate the container with its internal state
    • History – Monolith – to maximize the hardware and software cost
    • Microservices Minus the Hype: How to Build and Why [CON6462] – the PPT is a great one.
    • Api first mindset.
    • Playdoh principle.
    • 12 factor manifesto.
  • Modularization (Java Jigsaw)
    • java.base will be the base module
    • Only technology implemented in one module
    • java --list-modules
    • java --list-modules java.base
    • jlink – links a set of modules and creates a runtime.
    • Dependency and readability graph.
    • Public != accessible
    • Classpath and module path can be combined.
  • The Developers Guide to Performance Tuning.
    • Data-lite performance testing anti-pattern.
    • Even production volume data is not sufficient, one needs to simulate the exact production input.
    • Find out the dominating consumer of CPU – application, JVM, kernel, or nothing dominating.
    • The Diabolical Developer’s Guide to Performance Tuning
  • Performance Tuning and How to Upscale to Analyze in a Cluster Deployment
    • Use IBM Health Center (completely open source)
    • Works with IBM JDK and OpenJDK. With Oracle JDK it does not give profiling info.
    • JMX – one connection per client
    • MQTT – multiple connections per client
    • Symptoms – lower throughput, higher than expected CPU, application unresponsive.
    • Bottlenecks – GC, I\O, CPU, Resource contention – Locking.
    • Post mortem tools – GC logs, application logs, heap dump, thread dump.
    • Live monitoring
    • HealthCenter (IBM tool) available in IBM Marketplace. (native agent)
    • Cloud based – ELK stack
    • https://github.com/RuntimeTools/
  • Supercharge Your Java Code to Get Optimal Database Performance
    • Performance = latency * throughput
    • Latency = time to process UoW
    • Throughput = amount of UoW that can be executed in parallel
    • Commit only when needed.
    • Batching of transactions. Row by Row = Slow by Slow
    • Put the logic where it makes sense
    • Does not rollback entire batch – Insert into test … log errors;
    • Please use bind variables
    • https://github.com/gvenzl/Oracle-JavaOneSuperchargeCodeOptimalDBPerf
  • Threaddump
    • 8 ways to take a thread dump – jstack, kill -3, VisualVM, JMC, Windows Ctrl+Break, ThreadMXBean, APM tools, jcmd
    • https://Blog.fasthread.io
    • Causes – Garbage collection or Infinite loop
    • 8 types of OutOfMemoryError
    • Java Heap Space
    • Thread mill pattern – Creating new thread
    • Leprechaun pattern – Blocking the finalizer thread
    • RSI – Repetitive Stress Injury pattern – Creating unlimited thread
    • Traffic Jam pattern – Synchronized method.
  • Java 8
    • Faster than previous version – Fork\join, ParallelStream
    • Lambda – Fewer lines of code.
    • New Solution, Datetime api.
    • Minimize errors – Optional
    • Automated testing important – to make sure nothing breaks
    • Performance test – impact refactoring on performance.
    • https://github.com/mongodb/morphia
    • http://bit.ly/refJ8
  • Hystrix
    • to track the naked calls made without Hystrix
    • Clean any unnecessary calls
    • Build dashboard around the agent log to track outgoing calls.
    • Everything outgoing is wrapped with Hystrix.
    • Hystrix Audit agent –
    • Mindset change – code for resiliency
    • Unit test and integration with Wiremock
    • Hystrix metric stream.
    • Turbine – aggregate all streams in one client.
    • Hystrix Dashboard – D3 dashboard.
    • Netflix Atlas – Analyzing historic data.
    • Not applicable to batch and streaming use cases.
    • https://github.com/billyy/Hystrix-Tutorial
  • Art of backward compatibility
    • Command line serialver argument.
    • Ignore unknown, define default value – careful about this.
    • Semantic Compatibility
    • Object serialization bypasses constructor.
    • Custom serialization \deserialization
    • Bridge method injection
  • HTTP2 api
    • Each request has its own stream.
    • Each stream has its own flow.
    • Can reset a single stream without impacting others.
    • Server push
    • Prioritization of requests.
    • Header compression (HPACK)
    • Binary exchange protocol
    • Completable Future Api
  • CDI
    • Java CDI has more to offer over and above Spring, Guice, Seam
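Of the thread dump options listed in the Threaddump notes above, ThreadMXBean is the one that can be called from inside the application itself. Below is a minimal sketch using the standard java.lang.management API. The class name `ThreadDumper` and the output layout are my own and only loosely resemble what jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadDumper {

    /** Returns name, state and stack trace of every live thread in this JVM. */
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false = skip lock details, which keeps the call cheap.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```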

Yikes! Ask timed out on [ActorSelection[Anchor(akka://kafka-manager-system/), Path(/user/kafka-manager)]] after [5000 ms]

I am looking at Yahoo Kafka Manager as an admin console for my Kafka cluster. I followed the instructions on the GitHub repo to build the artifact (but skipped the section with the running instructions).

My local environment is windows.

shrik@DESKTOP-GLB3R08 C:\kafka-manager-1.3.0.7\bin
> kafka-manager.bat

I got the Kafka Manager UI at http://localhost:9000, but when I tried to add a Kafka cluster I got the error below on the screen

Yikes! Ask timed out on [ActorSelection[Anchor(akka://kafka-manager-system/), Path(/user/kafka-manager)]] after [5000 ms]

and the following error in the log file

[warn] o.a.c.ConnectionState – Connection attempt unsuccessful after 61188 (greater than max timeout of 60000). Resetting connection and trying again with a new connection.
[info] o.a.z.ZooKeeper – Initiating client connection, connectString=kafka-manager-zookeeper:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@628bc91a
[error] a.a.OneForOneStrategy – exception during creation
akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:166) ~[com.typesafe.akka.akka-actor_2.11-2.3.14.jar:na]
at akka.actor.ActorCell.create(ActorCell.scala:596) ~[com.typesafe.akka.akka-actor_2.11-2.3.14.jar:na]
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456) ~[com.typesafe.akka.akka-actor_2.11-2.3.14.jar:na]

The issue was that I did not have the ZooKeeper host configured correctly.

There are 2 ways you can set this: either on the command line

kafka-manager -Dkafka-manager.zkhosts="localhost:2181" -Dhttp.port=9999

or

via the application.conf property file.
kafka-manager.zkhosts="kafka-manager-zookeeper:2181" # this is the default value, change it to point to your zk instance.

(The -Dhttp.port parameter is the port number for the Kafka Manager application, so it will run on 9999 instead of the default port of 9000.)
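Before starting Kafka Manager it can save time to check that the configured zkhosts value is reachable at all. The sketch below is a plain TCP check in Java and is not part of Kafka Manager. The class name `ZkHostsCheck` is hypothetical, and a successful connect only proves that something is listening on that port, not that it is a healthy ZooKeeper.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class ZkHostsCheck {

    /** Splits "host:port" without doing a DNS lookup. */
    static InetSocketAddress parse(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("expected host:port but got " + hostPort);
        }
        String host = hostPort.substring(0, colon).trim();
        int port = Integer.parseInt(hostPort.substring(colon + 1).trim());
        return InetSocketAddress.createUnresolved(host, port);
    }

    /** True when a TCP connection to host:port succeeds within the timeout. */
    static boolean reachable(String hostPort, int timeoutMillis) {
        InetSocketAddress parsed = parse(hostPort);
        try (Socket socket = new Socket()) {
            // Resolve the host name here; an unknown host ends up in the catch block.
            socket.connect(new InetSocketAddress(parsed.getHostString(), parsed.getPort()),
                    timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // zkhosts can be a comma separated list; check each entry.
        String zkHosts = args.length > 0 ? args[0] : "localhost:2181";
        for (String entry : zkHosts.split(",")) {
            System.out.println(entry.trim() + " reachable: " + reachable(entry, 2000));
        }
    }
}
```

With the default value kafka-manager-zookeeper:2181 on my Windows box, this would have reported the host as unreachable straight away instead of after the 60000 ms timeout seen in the log.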


Scala – Functional vs Procedural quick sort performance comparison

Scala is becoming dominant in many scalable, distributed frameworks \ toolsets. Kafka, Spark and Akka are developed in Scala. I recently had to use both Kafka and Spark, and understanding Scala makes it easier to debug the framework code. So I started learning Scala. I am reading Scala By Example.

One of the first examples is quick sort. In it the author, Martin Odersky, describes the imperative (I call it procedural) way vs the functional way. Martin explains – “But where the imperative implementation operates in place by modifying the argument array, the functional implementation returns a new sorted array and leaves the argument array unchanged. The functional implementation thus requires more transient memory than the imperative one.”

So I wanted to find out how performance and memory usage compare between the 2 approaches. The code is checked in @ https://github.com/shrikantpatel/scala_learning/blob/master/src/QuickSortComparision.scala

The output for arraySize (10, 100, 1000, 10000, 100000, 1000000, 50000000) is

**********************************
array size – 10
Functional Sort Time : 64
Procedural Sort Time : 0
**********************************
array size – 100
Functional Sort Time : 13
Procedural Sort Time : 0
**********************************
array size – 1000
Functional Sort Time : 59
Procedural Sort Time : 1
**********************************
array size – 10000
Functional Sort Time : 75
Procedural Sort Time : 2
**********************************
array size – 100000
Functional Sort Time : 344
Procedural Sort Time : 12
**********************************
array size – 1000000
Functional Sort Time : 2159
Procedural Sort Time : 104
**********************************
array size – 50000000
Functional Sort Time : 112694
Procedural Sort Time : 6390

Below is the jconsole snapshot for the final run, with an array size of 50000000. The red highlighted point indicates where the functional sort completes and the procedural sort starts. (I put a readline between the 2 executions, so I waited a few seconds before the 2nd execution.) As is obvious, memory usage is around 1.5Gb in functional style vs 0.3Gb in procedural style. Also, CPU hovers around 25% for the entire duration of the functional style run.

I am still learning Scala. I believe the author's reason for putting this at the beginning is to caution developers to be mindful of where to use the procedural \ object oriented way and where to use the functional way.

[Image: QuickSortScala_Comparision]
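To make the difference between the two styles concrete without reproducing the linked Scala source, here is a rough Java analogue. It is a sketch with a hypothetical class name `QuickSorts`, not the code that produced the timings above. sortInPlace swaps elements inside the argument array, while sorted builds new arrays at every level of recursion, which is where the extra transient memory comes from.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class QuickSorts {

    /** Procedural style: sorts xs in place by swapping elements. */
    static void sortInPlace(int[] xs) {
        sort(xs, 0, xs.length - 1);
    }

    private static void sort(int[] xs, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int pivot = xs[lo + (hi - lo) / 2];
        int i = lo;
        int j = hi;
        while (i <= j) {
            while (xs[i] < pivot) i++;
            while (xs[j] > pivot) j--;
            if (i <= j) {
                int tmp = xs[i];
                xs[i] = xs[j];
                xs[j] = tmp;
                i++;
                j--;
            }
        }
        sort(xs, lo, j);
        sort(xs, i, hi);
    }

    /** Functional style: returns a new sorted array and leaves xs unchanged. */
    static int[] sorted(int[] xs) {
        if (xs.length <= 1) {
            return xs.clone();
        }
        int pivot = xs[xs.length / 2];
        // Each filter allocates a fresh array, so every recursion level copies the data.
        int[] smaller = Arrays.stream(xs).filter(x -> x < pivot).toArray();
        int[] equal = Arrays.stream(xs).filter(x -> x == pivot).toArray();
        int[] larger = Arrays.stream(xs).filter(x -> x > pivot).toArray();
        return IntStream.concat(
                IntStream.concat(Arrays.stream(sorted(smaller)), Arrays.stream(equal)),
                Arrays.stream(sorted(larger))).toArray();
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2};
        System.out.println("functional: " + Arrays.toString(sorted(data)));
        sortInPlace(data);
        System.out.println("in place  : " + Arrays.toString(data));
    }
}
```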