Issue with Log4j2 on AIX IBM JRE on WL12c

Detailed about the Log4j2 not working with WL12c issue.

There were multiple issues. Basically I can classify them in 2 categories

  • Log4j2 had includes functionality for Xinclude. It ran without issue on Dev’s local machine (Sun java runtime) but gave below error on QA server (IBM runtime). The IBM runtime failed parsing the log4j2 configuration with error –

<Sep 11, 2017 9:28:30 PM GMT> <Notice> <StdErr> <xxxxxxx> <xxxxxxx> <ExecuteThread: ‘4’ for queue: ‘weblogic.kernel.System’> <<WLS Kernel>> <> <> <1505165310988> <BEA-000000> <ERROR StatusLogger Error parsing /opt/bea/domains/xxxxx/xxxxx/log4j2.xml>

<Sep 11, 2017 9:28:30 PM GMT> <Notice> <StdErr> <xxxxx> <xxxxxx> <ExecuteThread: ‘4’ for queue: ‘weblogic.kernel.System’> <<WLS Kernel>> <> <> <1505165310994> <BEA-000000> <javax.xml.parsers.ParserConfigurationException: Feature ‘http://apache.org/xml/features/xinclude&#8217; is not recognized.>

Xerces supported the Xinclude functionality in 2.8 (2.7 supports as well but had some bug related to Xinclude).

So we did 2 things –

    1. explicitly include xerces version we needed in ear and
    2. modified weblogic-application.xml to force the class loader to load our version over the version provide by WL container. (container also 2.8.x version but we didn’t want to rely on container)

Even after the above changes, the error persisted. It turns out in AIX JRE there is flag that indicates if Xerces uses Xinclude aware parser. This would be enabled either at JDK level or as startup parameter. We included

-Dorg.apache.xerces.xni.parser.XMLParserConfiguration=org.apache.xerces.parsers.XIncludeAwareParserConfiguration

This fixed the Xinclude issue.

  • Now the issue was the Log4j2 configuration was not being honored. That is, If package x.y.z was configured at debug level in log4j property, we still did not see any debug logs for x.y.z in the log file.

The issue was their were multiple logging jar that were include in ear, in domain’s lib directory and other places, there no clarity of which jar were taking precedence in the class loader hierarchy.

  1. So I worked with dev to clean up the unwanted jars.
  2. We explicitly included the needed slf4j and log4j jar in the ear itself.
  3. Modified weblogic-application.xml to force the class loader to load our version over the version provide by WL container.

 

Advertisements

Network bandwidth between client and server

I was observing degradation in performance between my client and server on cloud deployment. I was researching if network latency was the issue and how to measure network bandwidth between client and server.

I started by looking at existing tools that would help my investigation. The first tool that i got interested based on my reading of blogs etc was Iperf3

From my experience so far, its very straight forward and easy to use. Its written in the C.

  1. Download the source code.

wget http://downloads.es.net/pub/iperf/iperf-3.1.6.tar.gz

tar -xvf iperf-3.1.6.tar.gz

cd iperf-3.1.6

  1. Next install the c compiler for building the source code.

My environment was AWS, so it easy to install development tools which has c compiler included (and more)

sudo yum groupinstall “Development Tools”

  1. Compile, build the iperf from code base.

sudo -s (I got some permission issue when it tried to install with ec2-user)

./configure; make; make install

(The build will fail on AIX because of this bug – https://github.com/esnet/iperf/issues/312)

  1. Start iperf on the server using the server mode option

./src/iperf3 -s

  1. Start iperf on the client using the client mode option

./src/iperf3 -x.x.x.x -t 10000

Larger image content at https://blogshri.files.wordpress.com/2017/08/iperf.jpg?w=1400

 

iperf

The iperf3 documentation has the details of all the option and possibilities.

org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: XXX

The other day my team ran into below issue. I did debugging to help them resolve this. The error was straight forward, issue with Authorization ie ACL (Access Control List) are not correctly setup.

[shri@xxxxx config]# /opt/kafka_2.11-0.10.1.1/bin/kafka-console-consumer.sh –bootstrap-server xxxxx:9093 —topic topic1 –consumer.config /opt/kafka_2.11-0.10.1.1/config/consumer-ssl.properties –from-beginning
[2017-06-08 23:06:15,290] WARN Error while fetching metadata with correlation id 1 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-06-08 23:06:15,292] ERROR Unknown error when running consumer: (kafka.tools.ConsoleConsumer$)
org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: group1

The way to debug issue,

1. List the ACL for the topic ie topic1 and consumer group ie group1

Current ACLs for resource `Topic:topic1`:
User:CN=consumer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Describe from hosts: *
User:CN=producer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Describe from hosts: *
User:CN=consumer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Read from hosts: *
User:CN=producer.shri.com,OU=IT,O=XXX,L=Austin,ST=TX,C=US has Allow permission for operations: Write from hosts: *

Current ACLs for resource `Group:group1`:
User:CN=consumer.shri.com,OU=IT,O=XXX,L=Dallas,ST=TX,C=US has Allow permission for operations: Read from hosts: *

2. List the certificate (keystore) being used by the consumer. As you can see there is minor difference in ACL setup for topic and group and cert being used by the consumer. ACL had location as Austin, while cert had location as Atlanta. We change cert to match what was in the ACL, that resolve the error.

The JKS subject is
[spatel@xxxxx config]# keytool -v -list -keystore test.jks
Certificate[1]:
Owner: CN=consumer.shri.com, OU=IT, O=XXX Inc., L=Atlanta, ST=TX, C=US

So its simple setup issue, make sure all the configuration and setup line up correctly.

SQL optimization using LISTAGG – Avoid repeat self join on child table.

I have scenario where there is OBJECTS table and it has child table ATTRIBUTES. The each row in OBJECTS can have different sets (ie different count, and types) of ATTRIBUTES.  (To make it easy to understand, i keep the tables simple).

CREATE TABLE objects
(
id INT,
name VARCHAR2(256),
PRIMARY KEY (id)
);

CREATE TABLE attributes
(
id INT,
object_id INT,
name VARCHAR2(256),
value VARCHAR2(256),
PRIMARY KEY (id),
CONSTRAINT fk_order FOREIGN KEY (object_id) REFERENCES objects(id)
);

INSERT into objects values(1, ‘toy’);
INSERT into attributes values(1, 1, ‘color’, ‘blue’);
INSERT into attributes values(2, 1, ‘weight’, ‘2lb’);
INSERT into attributes values(3, 1, ‘price’, ‘$25’);

 

INSERT into objects values(1, ‘food’);
INSERT into attributes values(2, 1, ‘weight’, ‘2lb’);
INSERT into attributes values(3, 1, ‘price’, ‘$25’);

Now we want list each OBJECT and corresponding attribute like.

ID NAME COLOR WEIGHT PRICE
1 toy blue 2lb $25

SELECT
o.id id,
o.name name,
a1.value color,
a2.value weight,
a3.value price
FROM
objects o,
attributes a1,
attributes a2,
attributes a3
WHERE
(o.id = a1.object_id and a1.name=’color’)
AND        (o.id = a2.object_id and a2.name=’weight’)
AND       (o.id = a3.object_id and a3.name =’price’)

This worked and gave us the required output. When i looked at this in code review i had some concerns with this approach. The ATTRIBUTES table is self joined 3 time and if  developer needed more attributes then he\she would have to rejoin ATTRIBUTES table that many more times. In production attribute table would have thousands of rows, this will turn into performance issue.

If we had this data in de-normalized form we would not have run into this issue. De-normalizing 2 tables is minor change, but in actual scenario it was connected to lot of other data, the de-normalization would lead to cascading effort. So i approached our DBA to pick their brains on this. He suggest use of LISTAGG, and with his help optimized query looked like below. 

select object_attr.id,
object_attr.name
LISTAGG(object_attr.value, ‘|—-|’) WITHIN GROUP (ORDER BY object_attr.attr_name) attribute
from (select o.id, o.name, a.name attr_name, a.value
from objects o,
attributes a
where a.name in (‘color’,’weight’,’price’)
and o.id = a.object_id
) object_attr
GROUP BY object_attr.id, object_attr.name
ORDER BY object_attr.id

ID NAME ATTRIBUTE
1 toy blue|—-|$25|—-|2lb

Explain plan for self joining un-optimized query was

Operation                            Node Cost             Cost CPU           Cost I/O         Cost Optimizer            Cardinality        Bytes Position          Partition Start       Partition Stop              Partition Id
SELECT STATEMENT       0.0 %                      1160                   107,161,454          1150                           ALL_ROWS                  39                            34671                1160

Explain plan for optimized query was

Operation                           Node Cost              Cost CPU             Cost I/O         Cost Optimizer             Cardinality          Bytes Position         Partition Start         Partition Stop        Partition Id
SELECT STATEMENT       0.0 %                     391                       47,603,589               387                            ALL_ROWS                193                              61374                        391

As always, with time we updated the query to get more attributes from ATTRIBUTES Table. The un-optimized query’s CPU and IO cost would have gone up exponentially with each new attribute fetch, but in our optimized query cost was pretty much constant. We had to do little more to parse the attribute field in java.

Debug SSL issue – part 2 (2 way SSL)

I covered 1 way SSL debug in – Debug SSL issue – part 1 (1 way SSL)

The only difference with 2 way SSL is additional step to verify the client certificate.

  • In *** ServerHello, TLSv1.2, the server challenges client to provide its certificate as well, you see below at the end of the server hello

*** CertificateRequest
Cert Types: RSA, DSS
Cert Authorities:
<CN=XXXXXXXXXXxXXXXX>
<OU=XXXXXXXXXXXXXXX>
<CN=XXXXXXXXXXXXXXXx>

  • Basically server is asking client provide a certificate that signed by any of the certificate authority (CA) provided in the list. Server only trust these CAs.
  • Client does look up in keystore \ identity store to find cert that match the list above. If it find one, it sends that cert to server.
  • Server that validates the cert sent by client, if it find that cert or cert chain in trust store, it prints

matching alias: XXXXXX
*** Certificate chain
chain [0] = […………

  • The next steps are similar to 1 way SSL.

 

 

Debug SSL issue – part 1 (1 way SSL)

In java to debug SSL issue, add -Djavax.net.debug=ssl to java command line argument. This can be added either to server side or client side and will print the SSL handshake details between the client server on standard out.

I assuming this 1 way SSL connection from client to server. For details on 1 way SSL – https://blogshri.wordpress.com/2014/08/24/ssl-part-3-https-communication-type/

Caution

  1. SSL debug is every verbose and prints a lot in log, so if you doing debug on the server side, it may be good to limit communication to server from client you are facing issue with.
  2. The exact output you may see may vary based on version of Java Runtime (JRE), i will try to keep this discuss generic.
  3. In enterprise environment there are lot of component between client and server – proxy, firewall, load balancer etc, so first debug it to make your request from client is making to server. Look at SSL debug only once the you have establish the communication from client is reaching server. And its failing in SSL handshake between client and server.

Once you have this added, let me explain how to understand the details that are printed.

  • The first thing it will print is *** ClientHello, TLSv1.2, this indicate the client request is making it to server and the connection protocol is TLSv1.2. After this line it will print bunch of details like cipher suite, extension that client accepts, rejects etc.
  • The server reciprocates with server hello *** ServerHello, TLSv1.2, this indicate the server is received the request and protocol that it will use. Also after it indicates the cipher it has selected. If there are cipher suite in common, you will see the error – no cipher suites in common.

Then as part of server hello, server presents it certificate

RandomCookie: GMT: 1495682071 bytes = { 188, 152, 123, 228, 111, 6, 216, 173, 128, 2,
chain [0] = [
[
Version: V3
Subject: XXXXXXXXXXXXXXXXXXXXXXXXXX
Signature Algorithm: SHA1withRSA, OID = 1.2.840.113549.1.1.5

Key: Sun RSA public key, 2048 bits
modulus: XXXXXXXXXXXXXXXXXXXxxxx
public exponent: 65537
Validity: [From: Thu May 12 14:47:31 CDT 2016,
To: Wed May 13 14:47:31 CDT 2026]
Issuer: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SerialNumber: [ c8]

  • And prints *** ServerHelloDone.
  • Now the client has to validate this certificate against the cert that present in its trust store. So client side you should see something like

***
Found trusted certificate:
[
[

If you dont have right certificate or CA in trust store in client side or if trust store if not correctly configured,  you will see error at this point in the log like – unable to find valid certification path

  • If certificate is found in client key store, client server exchange the symmetric key (based on cipher suite used) that will be used for any further communication.

 

 

Java One Session Notes

I recently attend java one (2016). My notes from that.

  • Java Technology and Ecosystem: Everything You Wanted to Know
    • Openjdk and oracle jdk 0 should be be able to use interchangable.
    • Prefer binary message format over json\xml
  • Microservices
    • N\w latency is never zero.
    • Containerize  MS as much as possible.
    • http://jav.mn/dockerjee
    • Container – standardize interface between dev and ops.
    • https://github.com/eldermoraes/javaoneus2016
    • Jelastic Docker- can migrate the container with its internal state
    • History – Monolith – to maximize the hardware and software cost
    • Microservices Minus the Hype: How to Build and Why [CON6462] – PPT greatone.
    • Api first mindset.
    • Playdoh principle.
    • 12 factor manifesto.
  • Modularization (Java Jigsaw)
    • Java.base will be base
    • Only technology implemented in one module
    • Java –list-modules
    • Java –list-modules java.base
    • Jlink – links set of modules and creates runtime.
    • Dependency and readability graph.
    • Public != accessible
    • Classpath and module path can be combined.
  • The Developers Guide to Performance Tunning.
    • Data-lite performance testing anti-pattern.
    • Even with production volume data is not sufficient, one needs to simulate exact production input.
    • Find out the dominating consumer of CPU – Appl, JVM, Kernel, nothing dominating.
    • The Diabolical Developer’s Guide to Performance Tuning
  • Performance Tuning and How to Upscale to Analyze in a Cluster Deployment
    • Use IBM Health Center (completely open source)
    • Works with IBM, open jdk. With oracle JDK does not give profiling info.
    • JMX – one connection per client
    • MQTT – multiple connection per client
    • Symptoms- Lower throughput, higher than expected to CPU, Application unresponsive.
    • Bottlenecks – GC, I\O, CPU, Resource contention – Locking.
    • Post mortem Tools – GC logs, Appilcation logs, heap dump, thread dump.
    • Live monitoring
    • HealthCenter (IBM tool) avialable in IBM Marketplace. (native agent)
    • Cloudbased – ELK stack
    • https://github.com/RuntimeTools/
  • Supercharge Your Java Code to Get Optimal Database Performance
    • Performance = latency * throughput
    • Latency = time to process UoW
    • Thoughput = amount of UoW can be execute in parallel
    • Commit only when needed.
    • Batching of transaction. Row by Row = Slow by Slow
    • Put the logic where it makes sense
    • Does not rollback entire batch – Insert into test … log errors;
    • Pls use bind variable
    • https://github.com/gvenzl/Oracle-JavaOneSuperchargeCodeOptimalDBPerf
  • Threaddump
    • 8 way to take threadump – jstack, kill -3, Visualvm, JMC, Windows Ctl+break, ThreadMXBean, APMTools, jcmd
    • https://Blog.fasthread.io
    • Causes – Garbage collection or Infinite loop
    • 8 Type of OutofMemory error
    • Java Heap Space
    • Thread mill pattern – Creating new thread
    • Leprechaun pattern – Blocking the finalizer thread
    • RSI – Repetitive Stress Injury pattern – Creating unlimited thread
    • Traffic Jam pattern – Synchronized method.
  • Java 8
    • Faster than previous version – Fork\join, ParallelStream
    • Lambda – Fewer line of code.
    • New Solution, Datetime api.
    • Minimize error @optional
    • Automated testing important – to make sure nothing breaks
    • Performance test – impact refactoring on performance.
    • https://github.com/mongodb/morphia
    • http://bit.ly/refJ8
  • Hystrix
    • to track the nake call made without hystrix
    • Clean any unnecessary calls
    • Build dashboard around the agent log to track outgoing calls.
    • Everything outgoing is around with Hystrix.
    • Hystrix Audit agent –
    • Mindset change – code for resiliency
    • Unit test and integration with Wiremock
    • Hsytrix metric stream.
    • Turbine – aggregate all stream in one client.
    • Hystrix Dashboard – D3 dashboard.
    • Netflix Altas – Analyzing historic data.
    • Not appcicable to batch and streaming usecase.
    • https://github.com/billyy/Hystrix-Tutorial
  • Art of backward compatibility
    • Command line serialver argument.
    • Ignore unknown, define default value – careful about this.
    • Semantic Compatibility
    • Object serialization bypasses constructor.
    • Custom serialization \deserialization
    • Bridge method injection
  • HTTP2 api
    • Each request has its own stream.
    • Each stream has its own flow.
    • Can reset single stream wiithout impacting others.
    • Server push
    • Prioritize of request.
    • Header Compress (HPACK)
    • Binary exchange protocol
    • Completable Future Api
  • CDI
    • Java CDI has more to offer over above spring, juice, seam