Defining Sub-Generations of the Millennials (In preparation to discuss High Performance Workspaces)

I’m planning to write a blog entry on “High Performance Workspaces” and how corporations claim these are the work environments that attract Millennials. I personally dislike these so-called “High Performance Workspaces”. But in preparation for that post, I want to define the differences between the age ranges within the Millennial Generation, because I have observed that Millennials are not one homogeneous group; I believe there are at least 3 distinct sub-generations or sub-groups within the generation.

Background Info for the chart I created below:

Date Ranges for both Generation X and Millennials have been pulled from Gallup’s definition of the two generations, which are:

Generation X Birth Date Range: 1965 to 1979

Millennial Birth Date Range: 1980 to 1996

So first, a little about myself, just to frame the conversation: I’m part of the tail end of Generation X, sometimes called the Xennials.

I have broken down the Millennial Generation into 3 separate sub-generational groups, simply named Early, Mid, and Late.

Early Millennials I define as having birth dates between 1980 and 1985.

Mid Millennials I define between 1986 and 1992.

Finally, I define Late Millennials between 1993 and 1996.

My personal feeling, based on observations of my younger cousins and their peers, is that the Millennial birth date range should extend to somewhere between 1999 and 2001; however, we’ll stick to the Gallup definition.

At the time of this writing I attribute the following observations to Early, Mid, and Late Millennials:

Early Millennial Attributes: They are in the process of buying their first homes, or are already in their first homes after renting. I find these Millennials moved out of their parents’ homes as soon as possible after college. Most are married, some with one child and perhaps another on the way. They are in the late stages of the beginning of their career or have settled into their career. While they have adopted social media as a large part of their social life, they went through their teenage years, high school, and even most or all of college without it, and therefore lean toward real-world social interaction more than their Mid and Late Millennial counterparts. They can balance real-world social interaction and social media interaction equally well. When speaking with them versus a late Generation Xer, there is almost no real difference. They also fall under the Xennials moniker for the blended cusp between Generation X and the Millennials.

Mid Millennial Attributes: Still living at home or perhaps renting, usually with a roommate, or living with a significant other but not married yet. They have only begun their career, and many are still working jobs just to make money to live, such as a Starbucks barista or another low-wage job. They are still questioning what they really want to do with their life. Where they are in their career search depends heavily on what they studied in college. If it was a hard science or professional degree, they are probably already in the early stages of their career, or perhaps already settled into it like their Early Millennial counterparts. But if they have a social science or other liberal arts degree, or even something like a generic business degree, they may be highly under-employed, as their degrees ill-equipped them for the corporate world. These are the Millennials into ideas like Tiny Houses, trying their best to live out the idea that they are special individuals who can’t be tied down to a “traditional cubicle-type job”. This group may have older siblings in the Late Gen-X or Early Millennial range, and if they are not among the Millennials who can support themselves without much of their parents’ help, they are constantly compared to their older brothers and sisters, cousins, and friends. It creates a hostile environment around the holidays, and a lot of competition. Usually their older siblings ignore the competition, realizing there really isn’t any competition to be had, so these Millennials are in actuality competing against their own sense of self. These Millennials grew up with early-generation web sites that grew into what we know today as social media. Throughout high school and college they were using things like instant messenger and chat rooms, and you start to see the point where it becomes almost easier for this part of the generation to communicate online. Whereas their older counterparts in the Early Millennial sub-group probably met most of their significant others in the real world, likely after high school, these Mid Millennials have been using dating web sites for most of their adult lives to find companionship.

Late Millennial Attributes: They are either still in college or have just graduated, or perhaps they are going straight on to graduate school. Most of them move back home with their parents right after graduation without job prospects. Again, this is highly dependent on their degrees, but also on how they spent their summers during college. If they landed an internship, they are more likely than their peers to find employment that leads to a career and allows them to support themselves without their parents’ help. The rest will follow the Mid Millennials into under-employment, such as barista or other retail or low-wage service jobs. They will be forced to work because their parents tell them they have to do something with their life, and they will probably complain a lot that their parents “just don’t understand them”. They are still seeking out ways to display their so-called “uniqueness” to the world, because they have been told their entire lives that they are special and can do “anything they want”. Interestingly, once they return home, their parents are usually on their case, asking when they will move out of the house and get a real job. They are completely consumed by social media, and many prefer texting or chatting online over real-world conversation. To find one without a smartphone in hand is like trying to spot a unicorn; it simply does not happen. Dating in this sub-group is almost exclusively assumed to happen online unless they met someone in college, and even then, many still date people via online platforms. Like their older Mid Millennial friends and family, they find themselves in a competition against themselves, since their direct friends and many of their Mid Millennial counterparts are in the same situation. Their older siblings among the Early Millennials or late Gen-Xers simply don’t want to be bothered with their annoyances; they have their own lives to take care of at this point, and they are quickly drifting further away from this range of Millennials in terms of social standing.

 

Year Born | Age as of 2018 | Millennial Sub-Generation | Age at September 11th Attacks | Age at Start of 2008 Housing Crisis | Estimated High School Start Year | High School Graduation Year | College Graduation Year
1980 | 38 | Early | 21 | 28 | 1994 | 1998 | 2002
1981 | 37 | Early | 20 | 27 | 1995 | 1999 | 2003
1982 | 36 | Early | 19 | 26 | 1996 | 2000 | 2004
1983 | 35 | Early | 18 | 25 | 1997 | 2001 | 2005
1984 | 34 | Early | 17 | 24 | 1998 | 2002 | 2006
1985 | 33 | Early | 16 | 23 | 1999 | 2003 | 2007
1986 | 32 | Mid | 15 | 22 | 2000 | 2004 | 2008
1987 | 31 | Mid | 14 | 21 | 2001 | 2005 | 2009
1988 | 30 | Mid | 13 | 20 | 2002 | 2006 | 2010
1989 | 29 | Mid | 12 | 19 | 2003 | 2007 | 2011
1990 | 28 | Mid | 11 | 18 | 2004 | 2008 | 2012
1991 | 27 | Mid | 10 | 17 | 2005 | 2009 | 2013
1992 | 26 | Mid | 9 | 16 | 2006 | 2010 | 2014
1993 | 25 | Late | 8 | 15 | 2007 | 2011 | 2015
1994 | 24 | Late | 7 | 14 | 2008 | 2012 | 2016
1995 | 23 | Late | 6 | 13 | 2009 | 2013 | 2017
1996 | 22 | Late | 5 | 12 | 2010 | 2014 | 2018

 


Master Map-Reduce Job – The One and Only ETL Map-Reduce Job you will ever have to write!

It’s fitting that my first article on Big Data is titled the “Master Map-Reduce Job”. I believe it truly is the one and only Map-Reduce job you will ever have to write, at least for ETL (Extract, Transform and Load) processes. I have been working with Big Data, and specifically with Hadoop, for about two years now, and I earned my Cloudera Certified Developer for Apache Hadoop (CCDH) certification almost a year before the writing of this post.

So what is the Master Map-Reduce Job? It is a concept I started to architect: a framework-level Map-Reduce job implementation that by itself is not a complete job, but uses Dependency Injection, essentially a plugin-style framework, to configure a Map-Reduce job specifically for ETL load processes.

Like most frameworks, you can write your process without it; however, what the Master Map-Reduce Job (MMRJ) does is break certain critical sections of a standard Map-Reduce job into plugins named specifically for ETL processing, which makes the jump from non-Hadoop-based ETL to Hadoop-based ETL easier for developers new to Hadoop.

I think this job is also extremely useful for the Map-Reduce pro who is implementing ETL jobs, or for groups of ETL developers that want to create consistent Map-Reduce-based loaders, and that is the real point of the MMRJ: to create a framework that enables developers to build robust, consistent, and easily maintainable Map-Reduce-based loaders. It follows my SFEMS (Stable, Flexible, Extensible, Maintainable, Scalable) development philosophy.

The point of the Master Map-Reduce concept framework is to break down the Driver, Mapper, and Reducer into parts that non-Hadoop/Map-Reduce programmers are already familiar with, especially in the ETL world. It is easy for Java developers who build loaders for a living to understand vocabulary like Validator, Transformer, Parser, OutputFormatter, etc. They can focus on writing business-specific logic and do not have to worry about the finer points of Map-Reduce.

As a manager, you can now hire a single senior Hadoop/Map-Reduce developer and staff the rest of your team with regular core Java developers, or better yet reuse your existing team. The one senior Hadoop developer maintains your version of the Master Map-Reduce Job framework code, and the rest of your developers focus on developing feed-level loader processes using the framework. In the end, all developers can learn Map-Reduce, but with this framework you do not need to know Map-Reduce to get started writing loaders that run on the Hadoop cluster.

The design is simple and can be shown in this one diagram:

[Diagram: Master Map-Reduce Job]

One of the core concepts that separates the Master Map-Reduce Job conceptual framework from a normal Map-Reduce job is how the Mapper and Reducer are structured: the logic that would normally be written directly in the map and reduce functions is externalized into classes that use vocabulary natively familiar to ETL Java developers, such as Validator, Parser, Transformer, and OutputFormatter. It is this externalization that simplifies Map-Reduce development for ETL jobs. I believe what confuses developers about making Map-Reduce jobs work as robust ETL processes is that the interface is too low level. A developer who does not have experience writing complex Map-Reduce jobs will take one look at a map function and a reduce function and say it is too low level, or perhaps even “I’m not sure exactly what they expect me to do with this.” Developers can be quickly turned off by the raw, low-level, although tremendously powerful, interface that Map-Reduce exposes.

The code below is the most valuable architectural asset of the framework: in the Master Map-Reduce Job conceptual framework, the map method of the Mapper class is broken down into a very simple process flow of FIVE steps that will make sense to any ETL developer. Please read through the comments for each step. Also note that the same thing is done for the Reducer, but only the Transformer and OutputFormatter are used.

The Map Function turned into an ETL Process Goldmine:


  @Override
  public void map(LongWritable key, Text value, OutputCollector<LongWritable, Text> output, Reporter reporter) throws IOException {
    String record;
    String[] fields;

    try {
      //First validate the record
      record = value.toString();

      if (validator.validateRecord(record)) {
        //Second parse valid records into fields
        fields = (String[]) parser.parse(record);

        //Third validate individual tokens or fields
        if (validator.validateFields(fields)) {
          //Fourth run transformation logic
          fields = (String[]) transformer.runMapSideTransform(fields);

          //Fifth output transformed records
          outputFormatter.writeMapSideFormat(key, fields, output);
        }
        else {
          //One or more fields are invalid!
          //For now just record that
          reporter.getCounter(MasterMapReduceCounters.VALIDATION_FAILED_RECORD_CNT).increment(1);
        }
      } //End if validator.validateRecord
      else {
        //Record is invalid!
        //For now just record, but perhaps more logic
        //to stop the loader if a threshold is reached
        reporter.getCounter(MasterMapReduceCounters.VALIDATION_FAILED_RECORD_CNT).increment(1);
      }
    } //End try block
    catch (MasterMapReduceException e) {
      throw new IOException(e);
    }
  }
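The Validator, Parser, Transformer, and OutputFormatter referenced above are the framework's plugin points, but their interfaces are not shown in this post. The sketch below is only my hedged reconstruction, inferred from how the map and reduce methods call them (note the casts, which imply Object-based return types); treat every signature as an assumption, and note that each interface would live in its own source file.

//Hedged sketch of the plugin interfaces, inferred from the calls made in the
//Mapper and Reducer; the real framework may define them differently.
import java.io.IOException;
import org.apache.hadoop.mapred.OutputCollector;

public interface Validator {
  boolean validateRecord(String record) throws MasterMapReduceException;
  boolean validateFields(String[] fields) throws MasterMapReduceException;
}

public interface Parser {
  Object[] parse(String record) throws MasterMapReduceException; //Mapper casts the result to String[]
}

public interface Transformer {
  Object[] runMapSideTransform(Object[] fields) throws MasterMapReduceException;
  Object runReduceSideTransform(Object data) throws MasterMapReduceException;
}

public interface OutputFormatter {
  void writeMapSideFormat(Object key, Object[] fields, OutputCollector output) throws IOException, MasterMapReduceException;
  void writeReduceSideFormat(Object data, OutputCollector output) throws IOException, MasterMapReduceException;
}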

Source Code for the Master Map-Reduce Concept Framework:

The source code here should be considered a work in progress. I make no claims that it actually works, nor has it been stress tested in any way; it should only be used as a reference. Do not use it directly in mission-critical or production applications.

All Code on this page is released under the following open source license:

Copyright 2016 Robert C. Ilardi
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

MasterMapReduceDriver.java – This class is a generic Map-Reduce driver program that makes use of two classes from the MasterMapReduce concept framework: MasterMapReduceConfigDao and PluginController. Both are responsible for returning configuration data to the MasterMapReduceDriver, as well as (as we will see later on) to the Master Mapper and Master Reducer. The MasterMapReduceConfigDao is a standard Data Access Object implementation that wraps data access to HBase, where configuration tables are created that use a “Feed Name” as the row key and have various columns representing class names or other configuration information such as the job name, number of reducer tasks, etc. The PluginController is a higher-level wrapper around the DAO: whereas the DAO is responsible for low-level data access to HBase, the PluginController performs the class creation and other high-level functions that make use of the data returned by the DAO. We do not present the implementations of the DAO or the PluginController here because they are simple POJOs that you should implement based on your configuration strategy; instead of HBase, for example, configuration could be stored in a set of plain text files on HDFS or even the local file system.
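Since those two classes are intentionally left out, the outline below is only a hedged sketch of what their surfaces might look like; every method name is an assumption inferred from how the driver, BaseMasterMapper, and BaseMasterReducer call them, and the bodies are stubs you would replace with your HBase or file-based implementation.

//Hypothetical outline only; method names are inferred from the framework code
//in this post, and each class would live in its own source file.
public class MasterMapReduceConfigDao {
  public void init(String feedName) { /* open the HBase (or file-based) configuration for this feed */ }
  public String getLoaderJobNameByFeedName() { return null; }
  public String getLoaderJobInputPath() { return null; }
  public String getLoaderJobOutputPath() { return null; }
}

public class PluginController {
  private MasterMapReduceConfigDao confDao;

  public void setConfigurationDao(MasterMapReduceConfigDao confDao) { this.confDao = confDao; }
  public void init() { /* instantiate the plugin classes named in the configuration, e.g. via reflection */ }

  //Plugin lookups used by the driver
  public Class getMapper() { return null; }
  public Class getReducer() { return null; }
  public Class getInputFormat() { return null; }
  public Class getOutputFormat() { return null; }
  public Class getOutputKey() { return null; }
  public Class getOutputValue() { return null; }
  public Class getPartitioner() { return null; }
  public Class getCombiner() { return null; }
  public boolean hasExplicitReducerCount() { return false; }
  public int getReducerCount() { return 0; }

  //Plugin lookups used by the mapper and reducer
  public Validator getValidator() { return null; }
  public Parser getParser() { return null; }
  public Transformer getTransformer() { return null; }
  public OutputFormatter getOutputFormatter() { return null; }
}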

The Master Map-Reduce Driver is responsible for setting up the Map-Reduce job just like any other standard Map-Reduce driver. The main difference is that it has been written to make use of the plugin architecture to configure the job's parameters dynamically.

/**
 * Created Feb 1, 2016
 */

package com.roguelogic.mrloader;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/**
 * @author Robert C. Ilardi
 */

public class MasterMapReduceDriver extends Configured implements Tool {

  public static final String MMR_FEED_NAME = "RL.MasterMapReduce.FeedName";

  private MasterMapReduceConfigDao confDao;
  private PluginController pluginController;

  private String feedName;
  private String mmrJobName;
  private String inputPath;
  private String outputPath;

  public MasterMapReduceDriver() {
    super();
  }

  public synchronized void init(String feedName) {
    System.out.println("Initializing MasterMapReduce Driver for Feed Name: " + feedName);

    this.feedName = feedName;

    //Create MMR Configuration DAO (Data Access Object)
    confDao = new MasterMapReduceConfigDao();
    confDao.init(feedName); //Initialize Config DAO for specific Feed Name

    //Read Driver Level Properties
    mmrJobName = confDao.getLoaderJobNameByFeedName();
    inputPath = confDao.getLoaderJobInputPath();
    outputPath = confDao.getLoaderJobOutputPath();

    //Configure MMR Plugin Controller
    pluginController = new PluginController();
    pluginController.setConfigurationDao(confDao);
    pluginController.init();
  }

  @Override
  public int run(String[] args) throws Exception {
    JobConf jConf;
    Configuration conf;
    int res;

    conf = getConf();

    jConf = new JobConf(conf, this.getClass());
    jConf.setJarByClass(this.getClass());

    //Set some shared parameters to send to Mapper and Reducer
    jConf.set(MMR_FEED_NAME, feedName);

    configureBaseMapReduceComponents(jConf);
    configureBaseMapReduceOutputFormat(jConf);
    configureBaseMapReduceInputFormat(jConf);

    res = startMapReduceJob(jConf);

    return res;
  }

  private void configureBaseMapReduceInputFormat(JobConf jConf) {
    Class clazz;

    clazz = pluginController.getInputFormat();
    jConf.setInputFormat(clazz);

    FileInputFormat.setInputPaths(jConf, new Path(inputPath));
  }

  private void configureBaseMapReduceOutputFormat(JobConf jConf) {
    Class clazz;

    clazz = pluginController.getOutputKey();
    jConf.setOutputKeyClass(clazz);

    clazz = pluginController.getOutputValue();
    jConf.setOutputValueClass(clazz);

    clazz = pluginController.getOutputFormat();
    jConf.setOutputFormat(clazz);

    FileOutputFormat.setOutputPath(jConf, new Path(outputPath));
  }

  private void configureBaseMapReduceComponents(JobConf jConf) {
    Class clazz;
    int cnt;

    //Set Mapper Class
    clazz = pluginController.getMapper();
    jConf.setMapperClass(clazz);

    //Optionally Set Custom Reducer Class
    clazz = pluginController.getReducer();
    if (clazz != null) {
      jConf.setReducerClass(clazz);
    }

    //Optionally explicitly set number of reducers if available
    if (pluginController.hasExplicitReducerCount()) {
      cnt = pluginController.getReducerCount();
      jConf.setNumReduceTasks(cnt);
    }

    //Set Partitioner Class if a custom one is required for this Job
    clazz = pluginController.getPartitioner();
    if (clazz != null) {
      jConf.setPartitionerClass(clazz);
    }

    //Set Combiner Class if a custom one is required for this Job
    clazz = pluginController.getCombiner();
    if (clazz != null) {
      jConf.setCombinerClass(clazz);
    }
  }

  private int startMapReduceJob(JobConf jConf) throws IOException {
    int res;
    RunningJob job;

    job = JobClient.runJob(jConf);
    res = 0;

    return res;
  }

  public static void main(String[] args) {
    int exitCd;
    MasterMapReduceDriver mmrDriver;
    Configuration conf;
    String feedName;

    if (args.length < 1) {
      exitCd = 1;
      System.err.println("Usage: java " + MasterMapReduceDriver.class + " [FEED_NAME]");
    }
    else {
      try {
        feedName = args[0];

        conf = new Configuration();

        mmrDriver = new MasterMapReduceDriver();
        mmrDriver.init(feedName);

        exitCd = ToolRunner.run(conf, mmrDriver, args);
      } //End try block
      catch (Exception e) {
        exitCd = 1;
        e.printStackTrace();
      }
    }

    System.exit(exitCd);
  }

}



BaseMasterMapper.java – This class is an abstract base class that implements the configure method of the Mapper implementation to make use of the DAO and PluginController already described above. It should be extended by all the Mapper implementations you use when creating a Map-Reduce job with the Master Map-Reduce concept framework. In the future we might add helper functions to this class for the mappers to use. In the end you only need a small number of Mapper implementations; it is envisioned that the number of mappers relates more to the number of file formats you have than to the number of feeds. The idea of the framework is that you should not have to write the lower-level components of a Map-Reduce job at the feed level; instead, developers should focus on the business logic, such as validation logic and transformation logic. The fact that this logic runs in a Map-Reduce job is simply because it needs to run on the Hadoop cluster; otherwise these loader jobs execute logic like any other standard loader job running outside of the Hadoop cluster.

/**
 * Created Feb 1, 2016
 */

package com.roguelogic.mrloader;

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

/**
 * @author Robert C. Ilardi
 */

public abstract class BaseMasterMapper extends MapReduceBase {

  protected String feedName;

  protected MasterMapReduceConfigDao confDao;
  protected PluginController pluginController;

  protected Validator validator; //Used to validate Records and Fields
  protected Parser parser; //Used to parse records into fields
  protected Transformer transformer; //Used to run transformation logic on fields
  protected OutputFormatter outputFormatter; //Used to write out formatted records

  public BaseMasterMapper() {
    super();
  }

  @Override
  public void configure(JobConf conf) {
    feedName = conf.get(MasterMapReduceDriver.MMR_FEED_NAME);

    confDao = new MasterMapReduceConfigDao();
    confDao.init(feedName);

    pluginController = new PluginController();
    pluginController.setConfigurationDao(confDao);
    pluginController.init();

    validator = pluginController.getValidator();
    parser = pluginController.getParser();
    transformer = pluginController.getTransformer();
    outputFormatter = pluginController.getOutputFormatter();
  }

}



BaseMasterReducer.java – Just like on the Mapper side, this class is the base class for all Reducer implementations used with the Master Map-Reduce Job framework. Like the BaseMasterMapper class, it implements the configure method and provides access to the DAO and PluginController for reducer implementations. Again, in the future we may expand this to include additional helper functions.

/**
 * Created Feb 1, 2016
 */

package com.roguelogic.mrloader;

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

/**
 * @author Robert C. Ilardi
 */

public abstract class BaseMasterReducer extends MapReduceBase {

  protected String feedName;

  protected MasterMapReduceConfigDao confDao;
  protected PluginController pluginController;

  protected Transformer transformer; //Used to run transformation logic on fields
  protected OutputFormatter outputFormatter; //Used to write out formatted records

  public BaseMasterReducer() {
    super();
  }

  @Override
  public void configure(JobConf conf) {
    feedName = conf.get(MasterMapReduceDriver.MMR_FEED_NAME);

    confDao = new MasterMapReduceConfigDao();
    confDao.init(feedName);

    pluginController = new PluginController();
    pluginController.setConfigurationDao(confDao);
    pluginController.init();

    transformer = pluginController.getTransformer();
    outputFormatter = pluginController.getOutputFormatter();
  }

}



StringRecordMasterMapper.java – This is an example implementation of what a Master Mapper implementation would look like. Note that it has nothing to do with a particular feed; instead it is tied to the file format. Specifically, this class would make sense as a mapper for a delimited text file format.


/**
 * Created Feb 1, 2016
 */

package com.roguelogic.mrloader;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

/**
 * @author Robert C. Ilardi
 */

public class StringRecordMasterMapper extends BaseMasterMapper implements Mapper<LongWritable, Text, LongWritable, Text> {

  public StringRecordMasterMapper() {
    super();
  }

  @Override
  public void map(LongWritable key, Text value, OutputCollector<LongWritable, Text> output, Reporter reporter) throws IOException {
    String record;
    String[] fields;

    try {
      //First validate the record
      record = value.toString();

      if (validator.validateRecord(record)) {
        //Second parse valid records into fields
        fields = (String[]) parser.parse(record);

        //Third validate individual tokens or fields
        if (validator.validateFields(fields)) {
          //Fourth run transformation logic
          fields = (String[]) transformer.runMapSideTransform(fields);

          //Fifth output transformed records
          outputFormatter.writeMapSideFormat(key, fields, output);
        }
        else {
          //One or more fields are invalid!
          //For now just record that
          reporter.getCounter(MasterMapReduceCounters.VALIDATION_FAILED_RECORD_CNT).increment(1);
        }
      } //End if validator.validateRecord
      else {
        //Record is invalid!
        //For now just record, but perhaps more logic
        //to stop the loader if a threshold is reached
        reporter.getCounter(MasterMapReduceCounters.VALIDATION_FAILED_RECORD_CNT).increment(1);
      }
    } //End try block
    catch (MasterMapReduceException e) {
      throw new IOException(e);
    }
  }

}



StringRecordMasterReducer.java – This is an example implementation of what the Master Reducer would look like. It complements the StringRecordMasterMapper from above in that it works well with text-line / delimited file formats. The idea here is that the Mapper parses and transforms raw feed data into a canonical data model and outputs that transformed data in a similar delimited text file format. Most likely the Reducer implementation can simply be a pass-through. It's possible that a reducer in this case is not even needed, and we can configure the Master Map-Reduce Driver to be a map-only job.


/**
 * Created Feb 1, 2016
 */

package com.roguelogic.mrloader;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

/**
 * @author Robert C. Ilardi
 */

public class StringRecordMasterReducer extends BaseMasterReducer implements Reducer<LongWritable, Text, NullWritable, Text> {

  public StringRecordMasterReducer() {
    super();
  }

  @Override
  public void reduce(LongWritable key, Iterator<Text> values, OutputCollector<NullWritable, Text> output, Reporter reporter) throws IOException {
    String data;
    Text txt;

    try {
      while (values.hasNext()) {
        txt = values.next();
        data = txt.toString();

        //First run transformation logic
        data = (String) transformer.runReduceSideTransform(data);

        //Second output transformed records
        outputFormatter.writeReduceSideFormat(data, output);
      } //End while (values.hasNext())
    } //End try block
    catch (MasterMapReduceException e) {
      throw new IOException(e);
    }
  }

}
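To show how a feed-level developer plugs business logic into the framework, here is a purely hypothetical example of a pipe-delimited feed's Parser and Validator plugins. The class names, the expected field count, and the plugin interface signatures are assumptions layered on the hedged sketches earlier in this post, not code from the actual framework; the PluginController configuration for the feed would simply name these classes, and StringRecordMasterMapper is reused unchanged.

//Hypothetical feed-level plugins for a pipe-delimited feed; names and
//signatures are assumptions based on the interface sketches above, and
//each class would live in its own source file.
public class PipeDelimitedParser implements Parser {
  @Override
  public Object[] parse(String record) throws MasterMapReduceException {
    //Split on the pipe delimiter; the -1 limit keeps trailing empty fields
    return record.split("\\|", -1);
  }
}

public class ExampleFeedValidator implements Validator {
  private static final int EXPECTED_FIELD_CNT = 5; //Assumed field count for this feed

  @Override
  public boolean validateRecord(String record) throws MasterMapReduceException {
    //Reject null or blank lines before parsing
    return record != null && record.trim().length() > 0;
  }

  @Override
  public boolean validateFields(String[] fields) throws MasterMapReduceException {
    if (fields == null || fields.length != EXPECTED_FIELD_CNT) {
      return false;
    }

    for (String field : fields) {
      if (field == null) {
        return false;
      }
    }

    return true;
  }
}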



Conclusion

In the end, some may ask how much value a framework like this adds. Isn't Map-Reduce simple enough? The truth is, we need to ask this of all the frameworks and wrappers we use: is their inclusion worth it? I think in this case the Master Map-Reduce framework does add value. It breaks down the Driver, Mapper, and Reducer into parts that non-Hadoop/Map-Reduce programmers are well familiar with, especially in the ETL world. It is easy for Java developers who build loaders for a living to understand vocabulary like Validator, Transformer, Parser, OutputFormatter, etc. They can focus on writing business-specific logic and do not have to worry about the finer points of Map-Reduce. Combine this with the fact that the framework creates an environment where you can create hundreds of Map-Reduce programs, one for each feed you are loading, each with the exact same Map-Reduce structure, and I believe this framework is well worth it.

Just Another Stream of Random Bits…
– Robert C. Ilardi

Synthetic Transactions and Capability Monitoring of your Enterprise Architecture

Back in my days at Lehman Brothers, I was introduced to the concept of “Synthetic Transactions”: an automated action scheduled to execute periodically to monitor the performance and availability of one or more components in your enterprise architecture.

Most architects will use SNMP, simple pinging of servers, routers, and networks, and monitoring of things like disk space, CPU usage, and memory usage: pretty much anything that can be recorded via HP OpenView / HP BTO (Business Technology Optimization). I believe this is fine for infrastructure monitoring, but for application monitoring, which gives you a better view into the health of your enterprise architecture as it matters to real users and clients, Synthetic Transactions are far superior.

Synthetic Transactions go further than simple network or infrastructure monitoring, and further even than simple application performance metrics monitoring with, say, a tool like ITRS's Geneos. A Synthetic Transaction is really about testing the capabilities of your systems and applications from the viewpoint of an end user or a calling client system, to ensure that the system is available with the capabilities and performance profile agreed upon by the contract set in your requirements.

Synthetic Transactions are not always easy to implement, and great care must be put into planning their inclusion from the beginning of system design and architecture analysis; they should be part of your non-functional requirements.

Also, in terms of information security and intrusion detection, Synthetic Transactions are a way to start implementing the next phase of network defenses. As you know, in today's world firewalls are no longer sufficient to keep hackers out of your systems. More and more hackers have turned to attacking specific application weaknesses instead of going after the raw network infrastructure, since the infrastructure was the first and easiest thing for organizations to shore up.

While Synthetic Transactions won't prevent cyber attacks or increase security by themselves, the detailed component-level monitoring and performance metrics they provide can help identify applications, or components of applications, that are under attack or have been compromised, by surfacing the performance or behavioral anomalies that such attacks cause.

Microsoft has a good outline of what a Synthetic Transaction is; although they relate it to their Operations Manager product, the general information is valid regardless of whether you use a tool or develop your own Synthetic Transaction agents. Specifically, Microsoft states in this article: “Synthetic transactions are actions, run in real time, that are performed on monitored objects. You can use synthetic transactions to measure the performance of a monitored object and to see how Operations Manager reacts when synthetic stress is placed on your monitoring settings. For example, for a Web site, you can create a synthetic transaction that performs the actions of a customer connecting to the site and browsing through its pages. For databases, you can create transactions that connect to the database. You can then schedule these actions to occur at regular intervals to see how the database or Web site reacts and to see whether your monitoring settings, such as alerts and notifications, also react as expected.”

Another good definition, though more of a summary than what Microsoft outlined, is available on Wikipedia in the Operational Intelligence article, specifically the section on System Monitoring, where it states: “Capability monitoring usually refers to synthetic transactions where user activity is mimicked by a special software program, and the responses received are checked for correctness.”

Although Wikipedia does not have a lot of direct information about Synthetic Transactions, I do like the term “Capability Monitoring”, which is exactly what Synthetic Transactions attempt to do: monitor the capabilities of your system at any given moment, to give you, your developers, and your operations support staff a dashboard-level view into how your system is performing, which components are available, and, through the performance measures, the health of each of your system's components and therefore the overall health of your system and applications.

Back at Lehman, and as you can see in the Microsoft description, most times a Synthetic Transaction focuses on a single aspect of the system; for example, checking whether you are able to open a connection to a database. While this is a valid Synthetic Transaction, it is extremely simple, and may not provide you with enough information to tell whether your application is actually available from an end user or client system standpoint.

What I developed as a model for Synthetic Transactions back in 2006 was the ability for my transactions to interact with multiple tiers of my architecture, if not all of them.

The application for which I was developing Synthetic Transactions was a Reference Data system that included desktop and web-based front ends, a JavaEE (J2EE at the time) based middleware, a relational database, a workflow engine, and a message publisher, among other supporting components such as ETL processes and other batch processing.

The most useful test in this case would be one that touched the Middleware, interacted with the workflow engine, retrieved data from the database and potentially updated test records, and had those test messages published and received by the Synthetic Transaction Agent to verify the full flow of the system.

Creating the Agent:

To create the Agent that would initiate the transactions, I used a job scheduler such as Autosys or Control-M to kick the process off every couple of hours to collect metrics. (Since the application was a global app used 24x7, it was important that the application was not only available but also performant around the clock, and we needed to be alerted if the application was performing out of an acceptable range and which component was affected.)

The Agent itself was a client of the middleware. Since all services, such as the database and the workflow engine, were wrapped by the middleware, we could have the agent invoke different APIs that would perform a database search and record metrics, call an API that would create a workflow request, and move it automatically through the workflow steps.

At the end of the workflow, we were able to trigger the messaging publisher to broadcast a message. Since our Data Model allowed for Test records, and we built into our requirements that consumers generally filter out or otherwise ignore Test records in the message flow, we were able to send out test messages in the production environment that would not affect any of our downstream clients.

However, our Agent process could start up a message listener and listen specifically for test records. The Agent, by recording the start time of the workflow transaction and the receive time of the test record message, could then calculate the round-trip time of data flowing through the system.

Each individual API call from invocation to return can also be timed to test how each different API was performing.
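To illustrate the timing side of the agent, below is a minimal sketch of how one API call might be timed, recorded, and checked against a threshold. The MiddlewareClient interface, the method names, and the five-second threshold are hypothetical stand-ins for whatever client API your middleware exposes, not the actual implementation described above.

import java.util.concurrent.TimeUnit;

//Minimal sketch of a synthetic transaction agent timing one API call.
//MiddlewareClient and its searchTestRecords method are hypothetical.
public class SyntheticTransactionAgent {

  private static final long MAX_ACCEPTABLE_MILLIS = 5000; //Assumed SLA threshold

  private final MiddlewareClient client;

  public SyntheticTransactionAgent(MiddlewareClient client) {
    this.client = client;
  }

  public long timeSearchApi() {
    long start = System.nanoTime();

    client.searchTestRecords(); //Invoke the API against test records only

    long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

    //Persist the metric (e.g., to the metrics table the support dashboard reads)
    recordMetric("searchTestRecords", elapsedMillis);

    //Alert if the call took longer than the acceptable range
    if (elapsedMillis > MAX_ACCEPTABLE_MILLIS) {
      sendAlert("searchTestRecords exceeded threshold: " + elapsedMillis + " ms");
    }

    return elapsedMillis;
  }

  private void recordMetric(String apiName, long millis) { /* insert into the metrics store */ }

  private void sendAlert(String message) { /* email / page on-call support */ }

  public interface MiddlewareClient {
    void searchTestRecords();
  }
}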

In terms of ETL, since the Data Model again allowed for test records, we were able to create a small file of test records and trigger the ETL process as well to load the test records. The records in the database would be updated, in some cases with just a timestamp update, but it would still be a valid test, and valid metrics can still be collected.

Together this gave us a good dashboard view of the system's availability and performance at a given time. If we wanted to increase the resolution, all we had to do was decrease the period between each job start of the Agents.

We recorded the metrics in a database table, and created a simple web page, which production support teams could use to monitor the Synthetic Transactions and their reported metrics.

On a side note: if your APIs and libraries are written in Java and already record metrics that your developers use for debugging and unit testing, you can expose these directly via JMX, which can be accessed directly if your Synthetic Transaction Agent processes are also written in Java. Or you can create a separate function or API that returns the internal metrics recorded by your libraries, frameworks, and API deployments.
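As a hedged sketch of the JMX route, the standard MBean below exposes a single internally recorded latency value; the class names, the attribute, and the ObjectName are illustrative assumptions rather than the metrics object model mentioned in this post.

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

//Illustrative only: expose an internally recorded latency metric via JMX so a
//Java-based synthetic transaction agent (or JConsole) can read it remotely.
public class ApiMetrics implements ApiMetricsMBean {

  private volatile long lastSearchLatencyMillis;

  public void recordSearchLatency(long millis) {
    lastSearchLatencyMillis = millis;
  }

  @Override
  public long getLastSearchLatencyMillis() {
    return lastSearchLatencyMillis;
  }

  public static ApiMetrics registerMBean() throws Exception {
    ApiMetrics metrics = new ApiMetrics();
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    //The ObjectName below is an arbitrary example domain and type
    server.registerMBean(metrics, new ObjectName("com.example.metrics:type=ApiMetrics"));
    return metrics;
  }
}

//Standard MBean naming convention: the interface must be named <Impl>MBean.
interface ApiMetricsMBean {
  long getLastSearchLatencyMillis();
}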

A number of years ago, I developed a Performance Metrics object model and small set of helper functions for Java that I have been using for over a decade and I find that even today they are still the most useful performance metrics I can collect. Perhaps I will write up an article on collecting performance metrics in the applications you develop and share that simple object model and helper functions.

Automated alerts, such as paging the on-call support staff, could also be accomplished by simply specifying how many seconds or milliseconds a call to an API should take; if that period is exceeded, the Agent sends out email and paging alerts.

In the end, a lot of organizations have a global technology and architecture principle that mandates all their applications have some sort of automated system testing.

This can be accomplished by using the Synthetic Transaction paradigm.

It is worth noting that creating an architecture that supports Synthetic Transactions is not simple. You need to ensure that all components, especially your data and information models, allow for test records.

A way around the information model requirement is to roll back all transactions on your database instead of committing them. This would require a flag or a special API, separate from the normal data flow in your system, to ensure data is not permanently written to your database. The issue is that if you implement it this way, you cannot have a true end-to-end flow of test records in production. Still, you will be able to get most of the metrics you need.
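A minimal sketch of that rollback approach with plain JDBC is shown below; the table, column, and the designated test record id are placeholders I made up, and the point is simply that the write executes (so the timing is real) but is never committed.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

//Sketch of a rollback-based synthetic write: the statement really executes,
//so the timing is meaningful, but nothing is persisted.
public final class RollbackProbe {

  public static long timeSyntheticUpdate(Connection conn) throws SQLException {
    boolean originalAutoCommit = conn.getAutoCommit();
    conn.setAutoCommit(false); //Take manual control of the transaction

    long start = System.currentTimeMillis();
    try (PreparedStatement ps =
             conn.prepareStatement("UPDATE ref_record SET last_touch_ts = ? WHERE record_id = ?")) {
      ps.setTimestamp(1, new java.sql.Timestamp(System.currentTimeMillis()));
      ps.setLong(2, -1L); //A designated synthetic/test record id (assumed)
      ps.executeUpdate();
    } finally {
      conn.rollback(); //Never persist the synthetic change
      conn.setAutoCommit(originalAutoCommit);
    }

    return System.currentTimeMillis() - start;
  }
}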

Also, if your organization only mandates a certain level of automated testing or performance and availability monitoring, then perhaps true end-to-end data flow through your system is not required.

In my experience, however, even if the company I work for does not mandate true end-to-end testing, as a responsible application owner I prefer to have true end-to-end data flow testing available to me, so I can monitor my systems more accurately and give proper answers to stakeholders when users and client systems complain about performance or system availability.

Just Another Stream of Random Bits…
– Robert C. Ilardi
 
 

Lightweight User Reference Object for Securing APIs

Back in 2005, I was faced with developing a secure set of APIs that could run in multiple deployment configurations. At the time we were heavily developing EJBs, specifically Stateless Session Beans. We were also starting to deploy SOAP-based Web Services, and we were also packaging these same APIs in the form of standalone libraries.

On a side note, this is my first article on information security topics and developing secure applications. I have recently become increasingly interested in penetration testing and other information security topics, and I am even enrolled in classes and other forms of training. I have created the Security category on my blog to organize security-related topics on this web site. I hope they will help all of us create more secure applications.

Combined with what I call the Data Services Architecture and the Resource Bundle / App Resource Manager framework, I was able to create an architecture leveraging Factories, Mediators, Data Access Objects, and Facades to hide from the calling clients which “Mode” the APIs were running in, whether it was EJB, Web Services, or simply running from a Locally deployed Library on the classpath.

I was faced with the challenge of ensuring that these APIs, which number in the hundreds of individual API methods, were secure no matter which operating or deployment mode they ran in. Not only did a calling application or user have to authenticate with the Single Sign-On service provided by the firm, I also needed to create an entitlements framework that would allow fine-grained authorization, down to the individual method level, for each API.

As any good developer with exposure to basic information security and defensive programming techniques knows, we only want to log in once so that we do not have to pass credentials to each API we call; the established design for doing so is the assignment of a securely randomized, unguessable session ID. This ID does NOT have to be the HTTP session ID, which in the case of my requirements was technically only available when developing the Web Services.

Also, depending on your application server configuration and firm standards, you are probably running on a multi-node cluster, and some load balancers do not play well with HTTP session replication; again, depending on firm development standards, they may not even allow you to turn on session replication, and some may even have a requirement NOT to turn on session stickiness.

My solution was to develop two components. The first is called the Stateless User Cache, which is responsible for creating and managing sessions across clusters of application servers without app server session replication, and which also operates correctly in standalone, locally classpath-deployed environments such as Library Mode. The second is the Lightweight User Reference Object that this article is about.

We will go over the Stateless User Cache in more detail in a future blog article, but I wanted to mention it here because it is tied to the Lightweight User Reference Object.

So basically I provide an API, usually called ssoLogin, which wraps the firm's Single Sign-On service, whether that is authenticating against LDAP or Active Directory, or a vendor product such as SiteMinder.

The ssoLogin method will NOT return a User object containing all entitlements; instead it will leverage the Stateless User Cache to create a new "Session", store the User object in that session, and return a "Reference" or "Pointer" object to that session.

In this case you can think of it as an object form of a session ID.

The Class looks something like this:

import java.io.Serializable;

public class UserRef implements Serializable {

  private String sessionId;
  private long loginTimestamp;
  private long lastTouchTimestamp;
  private String userId; //Insecure if the user id is private; see notes below.

  //Getter and Setter methods...

  //HashCode and Equals methods...

}

Basically, as you can see, the UserRef object provides three to four pieces of information. The fourth, the userId, can be the username, a unique surrogate key, or, even better, a transient key that does not map to the real user id stored in the database.

However, it can be the real username or surrogate key depending on the application's security requirements. Take, for example, an instant messaging application. The username is public information, and it makes sense for the client to have the list of usernames the currently logged-in user has on their buddy, friends, or contact list. In this case there is no real security issue with storing the username in this field, because it is public, shared information.

However, in applications where usernames and ids are not required or never need to be shared, we should leave this field null or remove it from the UserRef object entirely.

One advantage of having the userId in the UserRef is when the same user or application logs in more than once and you want to tie different session ids back to the same user, and, for whatever requirement you have, the client needs to be able to look up the other sessions or communicate with them in some way.

Now, as a side note, technically this user id, whether real or a transient key securely generated and mapped on the server side to the real underlying user id, does not need to be sent back to the client at all. The unique session id is good enough, and you can keep the mapping of user ids to the sessions they own on the server side, which is much more secure. But I have found from experience that in my enterprise applications I sometimes need to expose the username or user id to the client side, and I usually do this through UserRef. Again, you need to work through security use cases to determine whether having this bit of information opens any vulnerabilities in your applications and whether any exploits could be created to take advantage of them. One vulnerability this may open up is username reconnaissance and collection, with potential spear-phishing attacks, or user id enumeration if the ids are insecurely generated, such as simple sequence numbers.

In any case, the UserRef with at minimum the sessionId field is required; the other information can be added or removed as your applications require. However, the more the client side needs to do without communicating with the server, especially if the API suite is used not only by web applications but also by desktop applications, or perhaps batch applications and server-side daemon processes, the more information you may need to include in your lightweight User Reference Object.

The next step is to require all developers on your team to include UserRef as a mandatory parameter in all their API methods.

Then you can use the Stateless User Cache, if you have something similar to it, or the HTTP session, with the UserRef object as the key to look up the full User object, which contains the user's entitlements.
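Putting the pieces together, here is only a hedged sketch of what the ssoLogin and lookup flow might look like. The SsoService and StatelessUserCache interfaces, the standard bean setters on UserRef, and the User type are hypothetical placeholders, since the real Stateless User Cache is not published in this post.

import java.util.UUID;

//Hypothetical sketch of the login/lookup flow. SsoService, StatelessUserCache,
//and User are stand-ins; the bean setters on UserRef are assumed to exist.
public class SecurityFacade {

  private final SsoService sso;
  private final StatelessUserCache userCache;

  public SecurityFacade(SsoService sso, StatelessUserCache userCache) {
    this.sso = sso;
    this.userCache = userCache;
  }

  public UserRef ssoLogin(String username, char[] password) {
    //Authenticate against the firm's SSO (LDAP, Active Directory, SiteMinder, etc.)
    User user = sso.authenticate(username, password);
    if (user == null) {
      return null; //Caller only ever sees the generic "Username and/or Password are invalid." message
    }

    //Create a session and hand back only the lightweight reference
    String sessionId = UUID.randomUUID().toString();

    UserRef ref = new UserRef();
    ref.setSessionId(sessionId);
    ref.setLoginTimestamp(System.currentTimeMillis());
    ref.setLastTouchTimestamp(System.currentTimeMillis());

    userCache.put(sessionId, user);

    return ref;
  }

  //Resolve the full User (with entitlements) from the lightweight reference
  protected User resolveUser(UserRef userRef) {
    return userCache.get(userRef.getSessionId());
  }

  public interface SsoService { User authenticate(String username, char[] password); }
  public interface StatelessUserCache { void put(String sessionId, User user); User get(String sessionId); }
  public interface User { }
}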

In a future post I will do a write up on my Entitlements Object Model so you can see how I store Entitlements or Authorization information in memory.

Usually I will create methods such as public boolean hasAccess(UserRef userRef, String apiName) throws AppSecurityException;

I then require all my developers to ensure, in the Mediator or Facade code that hides all Data Access Objects and other service handler objects, that the user has access to the method by making a call to "hasAccess" first.
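As a small illustration of that guard pattern, a facade method might look like the sketch below; the facade, DAO, and entitlements service names are hypothetical, and I assume AppSecurityException accepts a message string.

//Hypothetical facade method showing the hasAccess guard before any business logic.
public class ReferenceDataFacade {

  private final EntitlementsService entitlements; //Wraps the entitlements model
  private final ReferenceDataDao dao;             //Hidden data access object

  public ReferenceDataFacade(EntitlementsService entitlements, ReferenceDataDao dao) {
    this.entitlements = entitlements;
    this.dao = dao;
  }

  public java.util.List<String> searchRecords(UserRef userRef, String query) throws AppSecurityException {
    //Every API method starts with the entitlement check
    if (!entitlements.hasAccess(userRef, "searchRecords")) {
      throw new AppSecurityException("User is not entitled to call searchRecords");
    }

    return dao.search(query);
  }

  public interface EntitlementsService {
    boolean hasAccess(UserRef userRef, String apiName) throws AppSecurityException;
  }

  public interface ReferenceDataDao {
    java.util.List<String> search(String query);
  }
}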

It's easy to do a code review, or even write a script that automatically scans your source code, to ensure every method has a call to hasAccess.

One important note: the login method, in this case "ssoLogin", would normally be the only method that does not make a call to hasAccess, as all users should have implicit access to this method; even users that do not exist in your security databases or LDAP directories will simply get a login-failed message.

Remember: do not give potential hackers hints about whether they guessed a username correctly. Instead, use the generic login failure message: "Username and/or Password are invalid."

In this case the system does not give them a hint as to whether the username actually exists or they simply got the password wrong.

Finally, since the UserRef object is small, it has a smaller impact on I/O when transferring the object remotely via EJB or Web Services calls: a much smaller I/O footprint than passing the entire User object, which, besides being highly insecure, can also be a performance issue.

Let me know what you think of my User Reference Object and solutions for securing APIs, or for that matter any method you want to secure. I would love to hear from developers and penetration testers alike!

Finally, and I will probably write an entire post on this, but you can find plenty of information out there on the web: make sure that when you generate your own session id, you use secure randomization so the session id token is unguessable and incapable of being enumerated through a simple algorithm.

In Java there are two very simple solutions: use the SecureRandom class, NOT Math.random or the Random class, or use the UUID class to create a globally unique identifier.
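A minimal sketch of both options follows; the 32-byte token length and URL-safe Base64 encoding (Java 8+) are arbitrary choices of mine, not a recommendation from the original design.

import java.security.SecureRandom;
import java.util.Base64;
import java.util.UUID;

//Two simple ways to generate an unguessable session id token.
public final class SessionIdGenerator {

  private static final SecureRandom RANDOM = new SecureRandom();

  //Option 1: SecureRandom bytes, URL-safe Base64 encoded (length is arbitrary)
  public static String secureRandomToken() {
    byte[] bytes = new byte[32];
    RANDOM.nextBytes(bytes);
    return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
  }

  //Option 2: a random (type 4) UUID, which is also backed by a secure RNG
  public static String uuidToken() {
    return UUID.randomUUID().toString();
  }
}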

Just another stream of Random Bits…

-Robert C. Ilardi


Writing a Good Job Description for Hiring Core Java Developers

As a Development Manager or a Team Lead, you often need to write up Job Descriptions that include a brief description of the team and the role's responsibilities. Some people also include a description of the company, but I usually don't find this necessary if you work for a large, well-known company. However, if you are a startup or smaller business, you might want to include a short paragraph on your company.

I mostly focus on stating in the description the role's responsibilities, and I often include the following phrase: "This role requires hands on coding in [LANGUAGE] on a daily basis. This is NOT a Lead, Manager, or Architect position. However you may be required to participate in architectural discussions as needed. This is a [Senior | Mid-Level | Junior] Software Developer Role."

Often when hiring Senior Developers, you will find many candidates have already ventured into leadership, management, or architecture roles and do not really want to code on a daily basis, and you need to make it clear to all prospective candidates that you require hard-core development.

When hiring you want to ensure you not only get candidates that CAN code, but candidates that WANT to code. I can’t tell you how many times I’ve seen groups hire very smart people who can develop, but just don’t have the drive anymore to code on a daily basis, and when you need developers you have to ensure you hire committed coders.

In addition to Team Description, Role Responsibilities, and potentially a short description on the company, the most important section of the Job Description write up is the required skill set.

When hiring Core Java Developers, which is my preferred developer title to hire for both Back End and Middleware work, I normally use a list of skills that includes the following. I added comments to the list for the purposes of this article to help not only hiring managers but also potential candidates understand the method behind my madness:

  • Required Skills
    • Core Java (Java Version: 7+ ; or whatever version your organization requires)
    • OOP/OOD in Java
      • Interfaces, Classes, Polymorphism, Inheritance
      • Question I usually ask: Can you have an empty interface (also called a Marker Interface), and what is it used for? What is the most popular Marker Interface that ships standard with Java?
      • Design Patterns (GoF Patterns)
        • I always test for the Singleton Pattern.
        • Other common ones I frequently test for: Factory, Observer/Observable, Visitor
        • Less Frequently I’ll test for: Command and Chain of Responsibility
        • There are many other GoF (Gang of Four) Patterns, but I would get the hang of Java first.
        • Note: GoF Patterns are not limited to Java. You can implement these design patterns in any language. The four authors of GoF (which is where the "Four" in Gang of Four comes from) even state that they really just gave names to design patterns programmers have been using for decades, and they are right; a lot of the patterns you will recognize as designs you implemented yourself without even knowing the "official" name.
    • Collections (Lists, Maps, Sets)
      • Also able to code with Arrays directly
      • Able to use Arrays in the place of: Lists, Maps, and Sets efficiently
      • We test deep knowledge of collections, such as how to make a custom key class work correctly in a HashMap (see the sketch after this list).
      • Note on Arrays:
        • Because of the complex data structures I use, Arrays are also very important. Even though Arrays may seem simpler, they are actually harder to use correctly and to your advantage, which is why collections were invented in the first place.
        • In a traditional Computer Science program you will learn and use Arrays before you even talk about collections.
        • Simple collections can be made from Arrays, which is why I state above, "able to use Arrays in place of the various collections".
    • Exception Handling
      • Knowing how to handle the different types of Exceptions:
        • Checked (Any Exception that Extends from the class Exception)
        • Unchecked (Any Exception that Extends from the class RuntimeException)
      • What a Throwable is
      • What an Error object is and how it is different from an Exception.
      • Logging of exceptions. It's a plus to know Log4J, but I usually don't test for it, as long as the person knows why it's important to log exceptions.
    • Direct JDBC experience
      • Ability to use JDBC to Call Direct SQL Statements for both Query and Updating
      • Ability to use JDBC to Call Stored Procedures
      • Knowledge of Transaction Control using JDBC
      • Basically know what the following classes are and how to use them: DriverManager, Connection, Statement, PreparedStatement, CallableStatement, ResultSet, big plus: ResultsetMetadata.
      • I don’t usually care about ORM (object-relational mapping) Frameworks, such as Hibernate. Actually I hate Hibernate and frameworks like them, even though in some projects they are very popular.
      • Note on Connection Pools:
        • When you are building Web Apps and other applications hosted on a Web Server, or more generally an Application Server like WebLogic, WebSphere, or JBoss, you will usually use a JDBC Connection Pool. Knowledge of Connection Pools technically falls into the JavaEE space, and not the Core Java space, but in the real world you will most likely be using connection pools to create your connections and NOT the JDBC DriverManager directly. However, besides Connection creation, everything else is the same whether you are doing Direct JDBC or JDBC via a Connection Pool.
        • I state “Direct JDBC Experience” because this is how you can separate the Men from the Boys. A real developer will know how to create Connections directly from the JDBC DriverManager. More junior programmers may be part of a team that hides all the JDBC stuff from them and they just somehow magically get a connection to a Database in their code. Perhaps by using a Utility class a more senior developer on the team created for the team to use. In my projects we always do this, I architected an entire framework called App Resource Manager to handle JDBC connection management, whether it’s using Pools or Direct DriverManager.
    • Strings and I/O
      • Ability to read large raw data files and parse them into usable tokens for DB Loading or other processing
      • String Matching and Manipulation
        • Matching: Basically the String object's indexOf, startsWith, endsWith, and lastIndexOf, plus RegEx (Regular Expressions), i.e. the "matches" method.
        • Manipulation: Building of Strings using StringBuffer or StringBuilder, plus the String’s split, substring, trim and replace methods.
        • String Parsing / Tokenizing
          • The String’s “split” method is used more and more over the StringTokenizer class these days.
          • Basically if you are reading a Delimited Text File, like a “|” Pipe/Vertical Bar or a Comma Delimited File, or anything similar, you will be splitting or tokenizing each line.
          • It’s all about File Parsing or User Input Parsing.
        • Reading and Writing from/to Properties Files, XML, Plain Text Files.
    • Experience with Multi-Threading
      • Synchronization
        • Block Level
        • Method Level
        • Static Method / Class Level
      • Thread creation and control
        • Runnables and the Thread class
          • Creating a Thread using the Runnable Interface versus extending the Thread Class.
        • Wait, Notify, NotifyAll
          • The Classic Consumer/Producer Problem.
    • Basic SQL Knowledge is a must.
      • DML: Ability to write Select, Insert, Update, Delete, statements
      • DDL: Ability to Create Tables is a Plus but not required. I usually either do the data modeling and table creation myself, or have one of a handful of trusted developers design the tables, indexes, and constraints.
      • Stored Procedures – Writing Stored Procs is a plus, but not required; I usually have a good SQL developer on my teams.
    • XML
      • Familiarity with JAXB
      • Knowledge of: SAX, DOM, StAX
    • Knowledge of Java Annotations
      • I used to test for this less often, but Annotations have become so widely used that I started testing this more.
      • Mostly someone just needs to know how to apply annotations, not create their own annotations. I have created my own annotations in projects before, but most development solutions probably won’t need custom annotations; just use the ones that come with Java or were created for a particular Library.
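
For candidates or students who want something concrete to practice, here is a minimal sketch of the kind of custom HashMap key I’m talking about in the collections bullet above. The class and field names (TradeKey, book, tradeId) are made up purely for illustration; the point is that a key class should be immutable and must override BOTH equals() and hashCode() consistently, or lookups will quietly misbehave.

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/**
 * Minimal sketch of a custom HashMap key (illustrative names only).
 * The key is immutable and overrides BOTH equals() and hashCode()
 * based on its field values, not on object identity.
 */
public final class TradeKey {

  private final String book;
  private final long tradeId;

  public TradeKey(String book, long tradeId) {
    this.book = book;
    this.tradeId = tradeId;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }

    if (!(obj instanceof TradeKey)) {
      return false;
    }

    TradeKey other = (TradeKey) obj;
    return tradeId == other.tradeId && Objects.equals(book, other.book);
  }

  @Override
  public int hashCode() {
    return Objects.hash(book, tradeId);
  }

  public static void main(String[] args) {
    Map<TradeKey, String> trades = new HashMap<TradeKey, String>();
    trades.put(new TradeKey("EQUITY-NY", 1001L), "IBM 500 @ 120.25");

    // A brand new but equal key finds the entry, because HashMap lookups
    // use equals() and hashCode(), not object identity.
    System.out.println(trades.get(new TradeKey("EQUITY-NY", 1001L)));
  }
}

Comment out the hashCode() override and the same lookup will most likely come back null, which is exactly the kind of behavior I expect a candidate to be able to explain.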
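
Likewise, here is a minimal sketch of what I mean by Direct JDBC: creating the Connection straight from the DriverManager, running a parameterized query and an update, and handling the transaction yourself. The JDBC URL, credentials, and the ACCOUNTS table are placeholders I made up for illustration (and the appropriate driver jar is assumed to be on the classpath), so treat this as a sketch, not production code.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * Minimal sketch of Direct JDBC (placeholder URL, credentials, and table).
 * Shows DriverManager connection creation, a parameterized query and
 * update via PreparedStatement, and manual transaction control.
 */
public class DirectJdbcSketch {

  public static void main(String[] args) throws SQLException {
    String url = "jdbc:oracle:thin:@dbhost:1521:ORCL"; // placeholder only

    Connection conn = DriverManager.getConnection(url, "appUser", "appPassword");

    try {
      conn.setAutoCommit(false); // take control of the transaction ourselves

      // Parameterized query
      PreparedStatement query = conn.prepareStatement(
          "SELECT ACCOUNT_ID, ACCOUNT_NAME FROM ACCOUNTS WHERE REGION = ?");
      query.setString(1, "AMERICAS");

      ResultSet rs = query.executeQuery();
      while (rs.next()) {
        System.out.println(rs.getLong("ACCOUNT_ID") + " - " + rs.getString("ACCOUNT_NAME"));
      }
      rs.close();
      query.close();

      // Parameterized update in the same transaction
      PreparedStatement update = conn.prepareStatement(
          "UPDATE ACCOUNTS SET STATUS = ? WHERE REGION = ?");
      update.setString(1, "ACTIVE");
      update.setString(2, "AMERICAS");
      int rowsUpdated = update.executeUpdate();
      update.close();

      conn.commit(); // explicit commit
      System.out.println("Rows updated: " + rowsUpdated);
    } catch (SQLException e) {
      conn.rollback(); // undo the work if anything fails
      throw e;
    } finally {
      conn.close();
    }
  }
}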

After the Required Skills section, I usually add an Optional Skills section of items that are a Plus, which usually includes more JavaEE, Web Services, Linux/Unix Scripting and Command Line, and, depending on what products we are using, a DBMS, Hadoop Ecosystem Components, Workflow Engines, Messaging Services, and whatever else is specific to the project I’m hiring for.

Hope this post helps both hiring managers write better Job Descriptions, and Candidates or Students of Programming who are looking for a syllabus to study as they get ready to apply for jobs and enter the Software Development Job Market.

Just Another Stream of Random Bits…

– Robert C. Ilardi

Posted in Software Management | Leave a comment

The Lehman Brothers Experience…

I originally published this article on my personal blog “Just another stream of random bits…” back in September 2008, a few days after the now famous Bankruptcy of Lehman Brothers. Today, on the 7th anniversary of the bankruptcy, I was thinking about how great it was to work for Lehman, and how sad it still is to have lost an American company started in 1850, so I thought I would share this article once again. The original text of my 2008 article is below:

The Lehman Brothers Experience…

To my friends and family,

As you all know, I work for Lehman Brothers. And I’m pretty sure by now, you all have heard that the company filed for Bankruptcy earlier this week.

I have worked at Lehman for my entire professional career. Starting as a lowly summer intern in June 1998. Working my way up to a part-time hourly employee during my undergraduate studies at Polytechnic University, starting in May 1999. And finally being hired as a full time employee in May 2001. I have been employed at Lehman uninterrupted since May 1999.

Lehman has been more than a place where I go to work every day. It is a place where I have made many friends on the professional level, and some of those relationships have grown beyond the office. The three managers I have had since I started working at Lehman all attended Paula’s and my wedding back in 2005, and I have attended their weddings, Christmas parties at their homes, and even Chinese New Year celebrations.

I knew it would always be hard to finally say goodbye when the time came for me to leave this chapter of my life behind. At this point, it looks like I, along with probably a good 95% of the rest of the firm’s employees, will be forced to leave before most of us were ready to.

What’s worse is that there are many people who have spent the bulk of their professional careers at Lehman, and as they approach retirement, they are in bad shape, and I ask each of you to keep them in your prayers. Lehman was the last bank on Wall Street to have a pension plan, which most of the long-time employees are depending on for a significant portion of their retirement incomes. I know many individuals from my department who have been affected in this way. As of today, we still do not know what will happen with the pensions, so please let’s all hope and pray for the best. We younger employees will be OK, but those old-timers will really be feeling the pain on this issue in particular!

The Lehman Brothers culture, good or bad, was one of Pride and Family. There was a sense that we were all in this thing together and that we were all contributing to making the firm a direct competitor to the likes of Goldman Sachs. When I first started at Lehman, people told me, “Why are you joining that company!” “Come on Rob, you can do better than Lehman.” In fact, no one besides myself from my graduating class at Poly joined Lehman (although some did have offers and some were even interns along with me); they all thought they could do better.

The funny thing was, they all changed their tune after a few short months. Most of them lost their jobs due to the slowing economy and 9-11. Lehman, although displaced from its global headquarters (World Financial Center – Tower 3 / The American Express Tower), which was heavily damaged during the terrorist attacks, was committed to rebuilding a stronger firm than ever. They soon succeeded; we purchased a new Headquarters in midtown Manhattan, just north of Times Square, at 745 7th Ave. We got a new building in Jersey City to replace our outdated and overcrowded 101 Hudson IT and Operations building. We began expanding both domestically and internationally. Every week we would receive multiple internal communications on new acquisitions and deals being made across the globe. They started rebranding the company, and we went from logos that were just the gold letters of the words “Lehman Brothers” on a plain, solid dark green backdrop to “Lehman Brothers – Where Vision Gets Built…” flashing on full color, jumbo screens on the sides of the new 745 building in Times Square. Our regional headquarters in both London and Tokyo were also being upgraded. A few short years after 9-11, which everyone predicted would be the end for Lehman, we purchased the Asset Manager Neuberger Berman. This marked a turning point in Lehman’s history. We started to rise among the ranks of the Wall Street Power Houses, and instead of being compared to companies like Bear Stearns, we were being compared against Goldman and Morgan Stanley.

When I would visit Paula up at Syracuse University while she was studying for her MBA, her classmates would approach me asking how it was to work at Lehman; that was the company they wanted to work for; could I give them any tips on how to get their resumes in front of the right people to be hired by the firm? And it wasn’t limited to Syracuse and a bunch of young and hungry MBA students. Everywhere I went, when someone heard I worked for Lehman, they wanted to speak with me about the company and ask if I would mind taking a look at their resume.

I was so proud to work for Lehman. In just a couple of years, it went from a company of only 7,000 employees worldwide to over 25,000. It went from the company people told me to run as far away from as possible, to the company everyone wanted to work for.

Lehman was not a 9 to 5 shop; most managers, especially on the IT side, supported flex time and results-oriented management, meaning they didn’t care if you worked from home or came in an hour late so long as the deliverables were completed ahead of schedule and with high production quality. Of course, as my wife Paula can tell you, this also means you are on call 24 by 7. You not only carry a company Blackberry, but a pager as well. They have no problem calling up all the hotels in Las Vegas to find you and get you working on vacation. Trust me, it actually happened to me. But it all balanced out for the most part…

Lehman was known for treating their people right. Salaries were definitely comparable to the rest of the industry. Bonuses were great. Year End Parties were fun, with Town Cars for each employee to go home in. The flexibility to work from home when needed was amazing and will truly be missed. Departments, so long as they were “successful”, were given many perks. My department, for example, always had up-to-date Computers and dual 19in flat panel monitors for just about everyone. Office Supplies were readily available in drawers in front of the department admins. Free lunches at all large or long meetings. And we even went on a boat trip on a chartered NY Waterway Ferry with free beer and wine last year as a “team building” trip for the entire department. Training classes, both technical and soft skills, were made available to ALL employees freely. These classes would cost anywhere between $1500-3000 for a single day course to $5000-7000 for a five day training class. You potentially could have taken as many of these courses as you wanted to, so long as your direct manager approved. Being a manager myself, I have never seen an employee abuse the privilege, and I always approved all requests for these courses. If you could find a PC that wasn’t in use, in most cases the company and your managers would allow you to take it home, along with Flat Panel Monitors, to help with night support and working from home!

As you can see, it was an easy place to fall in love with as a place of business, which is why there are so many employees who are having such a hard time saying goodbye. I’m one of those employees. What has happened to Lehman over the past year has taken a toll on me in so many different dimensions: financially (the obvious one), emotionally, mentally, and physically. Over the past two years, I have been working extremely hard on “self betterment.” I have been meditating, attempting to resolve my stresses through open communication and relaxation, listening to soothing New Age music, dieting, exercising, resolving my medical issues such as my Ulcerative Colitis, etc…

I feel that I have taken many steps forward in this process, but now, with the stress and the “unknown” of the Lehman situation, I definitely feel like I have taken a few steps back.

I am extremely saddened by the downfall of Lehman. I feel like it is the end of an era. I have given Lehman so much, but they have also given back to me so much, in a world that depends upon money. I can’t imagine a world without Lehman. I did not know if I would be at Lehman for my entire career, and in fact, I felt that most probably I would not be. However, I thought I might eventually return to Lehman, as so many employees who have worked for the company do, and possibly retire there. It literally brings tears to my eyes to think that this is no longer an option. To think about walking by 745 7th Ave and reminiscing about all the meetings and presentations I gave there to Managing Directors for the Trader Portal project. To remember the feeling of pride as I walked through the doors of the building of my company; a company that was 30-33% owned by the employees.

Thank You for your time in reading this,

Robert Ilardi (Sept. 2008)
Vice President
Enterprise Account Management
Lehman Brothers, Inc.
70 Hudson St., 9th Floor
Jersey City, NJ 07302

Posted in Personal | Leave a comment

Another Class In the JAR, Part 1

I wrote this back when I was a lowly Senior Developer at Lehman Brothers. When I write code, both now and back then, I listen to music, usually Metallica or Pink Floyd, and sometimes SiriusXM’s Octane (yes, I have had an Air Card for my laptop since 2005)…

Written By: Robert C. Ilardi
Date: March 1, 2005
Obviously a parody of Pink Floyd’s “Another Brick in the Wall” (technically it’s a parody of Part II) –

———————————————————————————————–

We don’t need no Project Management

We don’t need no Source Control

No Code sarcasm in the Cubicle

Managers leave them Programmers alone

Hey! Managers! Leave them Programmers alone!

All in all you’re just another Class in the JAR…

All in all you’re just another Class in the JAR…

———————————————————————————————–

For you non-Java Programmers out there, here are some definitions for ya!

• Class in general
• Specifically a Java Class File
• JAR in Java
• Code or Source Code
• Definition of Source Control

So to sum it up: Programmers write Code or “Source Code” in Cubicles; that Source Code is organized into Classes, and in Java, lots of classes are stored within a JAR File. Programmers do not like Project Management, and Project Managers like to make sarcastic remarks about programmers’ code while standing above them in their Cubicles, all while reminding them that their source code changes had better be in the Source Control System, such as SVN or CVS!

Just Another Stream of Random Bits…

– Robert C. Ilardi
 
Posted in Randomness | Leave a comment

How to Share your Starbucks Smart Phone Bar Code with your Significant Other

It’s pretty simple. Since the Bar Code is just a representation of your gift card’s Serial Number, all you need to do is take a Screen Capture from your phone of the generated Starbucks Card Bar Code.

If you do not know how to take a screen capture with your particular smart phone, just google it. But here are the iPhone instructions: press the Home button and the Power/Sleep Button at the same time. Here’s a link: http://www.wikihow.com/Take-a-Screenshot-With-an-iPhone or this one on the Apple Forum: https://discussions.apple.com/thread/3739872?tstart=0

Then you can email it to your wife/husband/boyfriend/girlfriend, or whoever else you want to use your card.

Screen Cap Like So (Obviously I removed the real bar code and numbers from this picture):

[Screenshot: starbucks_app_screen_cap]

PLEASE NOTE: This does not fool the system into giving you free Starbucks! Nor was this an attempt to try to circumvent the system.

Also, DO NOT Send this to anyone who you do not want to use your Starbucks card and your money, treat this screen capture as if it was your real Starbucks card, because it is! Your balance will decrease each and every time someone scans the photo on their own smart phone at the Starbucks register.

Also, what’s cool about this method, is you do not need to share your password, and the person you share the photo with doesn’t even need the Starbucks App installed on their phone!

Just Another Stream of Random Bits…
– Robert C. Ilardi
Posted in Randomness | Leave a comment

You’re Not Thinking Fourth Dimensionally

This is a quote from the textbook we used in the Programming Languages course I took in college: “It is widely believed that the depth at which we can think is influenced by the expressive power of the language in which we communicate our thoughts. Those with a limited grasp of natural language are limited in the complexity of their thoughts particularly in depth of abstraction. In other words, it is difficult for people to conceptualize structures they cannot describe, verbally or in writing.” – From: Concepts of Programming Languages; ISBN: 978-0133943023; by: Robert W. Sebesta; Computer Science Professor Emeritus at University of Colorado at Colorado Springs, College of Engineering and Applied Science.

I believe that as we progress in our professional careers as software engineers, system architects, and development managers, we learn certain patterns, and no matter how creative we are, we slowly start to normalize into a recognizable pattern of designs, coding concepts, languages, and project management techniques that peers can look at and say, “Yes, that has the mark of [INSERT YOUR NAME HERE]”.

It’s not that we get lazy or lose our creativity; I just believe it’s human nature to fall back and lean on what we have learned “works”. And there’s really nothing wrong with that in itself (we obviously fall back on our education and the values we have picked up from our family, friends, churches, and cultures constantly over our lifetimes). However, I believe that programmers, and people who are at heart programmers, being creative individuals, need to stay creative, not only for their jobs but for their own sanity.

As Robert Sebesta has said, our thought processes are limited by the language we know. I believe it’s not only written language but visual representations as well, and how we can use those “thought elements”, our vocabulary in both the written/spoken language sense and the visualized sense, to create new and amazing designs, systems, and applications. Although in programming, eventually we need to describe these designs using words and math, especially if we are working in teams and need to produce readable, reusable, and maintainable code. See my article on “SFEMS – Stable, Flexible, Extensible, Maintainable, Scalable“. As good programming citizens we aren’t supposed to start naming everything a, b, c, a1, i2, j, k, etc…

So how do we stay fresh? Is it merely a matter of reading new books and watching the same types of movies we always have? No, I think this is the trap. We need to expand our minds. We need to “Free our Minds”, yes, from the Matrix. If our creative thought processes are limited to the language elements we know and understand, and if we limit our intake of new data, vocabulary, stories, music, and visual imagery to the same basic genres we have always watched, read, and otherwise consumed, then we really aren’t going to expand the base knowledge stores that the creative parts of our brain so randomly and naturally pull from to create new and interesting ideas, designs, and, in the case of programmers, new systems and applications.

I love the quote from Doc Brown in one of my favorite movies of all time, “Back to the Future“, where he tells Marty, “You’re Not Thinking Fourth Dimensionally.” (Although he doesn’t say this until Part III. Check out FuturePedia on the Fourth Dimension.) While he of course is referring to time as the fourth dimension, I’m thinking of this quote more generally, to mean that we aren’t thinking outside of our normal frame of thought. We aren’t considering new and different ideas, movies, music, art, books, and other information sources during our programming down time in order to expand our “thought element vocabulary”. Again, what I mean by thought element vocabulary is our knowledge base of ideas and concepts that we use to create, design, and express ourselves in programming.

I believe that exposure to information sources we already enjoy does not expand our minds after a few decades of consuming those books, comics, movies, etc.; instead we need to look to other genres, and even other activities, to further expand our minds.

If you are a sci-fi fan, how predictable are the plot elements to you? Here and there you will find something absolutely surprising, but for the most part, after a couple of decades of consuming all the sci-fi you can, you probably see patterns in the storylines, and I’m pretty sure you make comparisons about how one movie just ripped off half the story from another, or complain how there’s really nothing new out there and Hollywood just sucks (while I don’t disagree there).

So what I started to do is read and watch different genres of books, comics, and movies. I have also started to try to pick up new activities as well. Example: if you are a gamer, how long have you been playing games? In my case it’s since the mid-1980’s. I’m not saying to stop playing games; all I’m saying is to try something new. For example, recently I picked up drawing, and I have been reading up on and practicing meditation.

I believe learning these new activities and consuming these new genres will help me expand my mind and keep my creativity fresh and exciting. While I’m not saying or encouraging you to pick up something that you feel you would never like, I’m sure there are other genres and activities out there that you haven’t tried yet and would find enjoyable.

I also want to encourage you to day dream a little more. I remember as a child I would love to day dream about new inventions, space adventures, saving the world, etc. I think half of my ideas have some correlation to my childhood day dreaming. I now see it as some sort of meditation. Of course, as adults, we can’t day dream at work like we may have done in the classroom from time to time… or can we?

Just Another Stream of Random Bits…
– Robert C. Ilardi
 
 
Posted in Personal | Leave a comment

Singletons and Factories of Singletons

A Singleton is a design pattern that allows for One and ONLY One Object instance of a class to be instantiated within the memory of a process, using object oriented programming concepts.

Many people will ask: well, why can’t I just use a static instance of a class in my program as a field or data member? You can; however, in larger scale programs, you may not want to have to pass around that static instance from class to class or method to method. This often over-complicates your code and makes it harder to maintain.

Or, can’t I just use all static fields/data members and methods/functions in my class to accomplish the same thing? Again, you can; however, you lose a lot of the benefits of having a true SINGLE instance of an Object. You are now dealing with an object that is less flexible than if you made a Singleton instance of that class.

There is some debate among programmers on whether you could just use a class with all static methods and fields versus a singleton, but a singleton’s behavior is more predictable across languages, because the design and ordering of creation is controlled by the programmer instead of the system. So my own preference is not to use a class with all static members, and to go with a Singleton.

I also personally feel that a class with all static methods should be restricted to classes that only contain helper methods, such as a class that contains String manipulation helper methods. I usually have this type of class in my systems, called “StringUtils”. All methods are public, static, and stateless. They are used to perform a common string manipulation function and return the result.
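
As a quick illustration of what I mean by that kind of helper class, here is a tiny sketch. The method names are made up for illustration and are not from my real StringUtils; the point is simply that everything is public, static, and stateless.

/**
 * Sketch of an all-static, stateless helper class (illustrative methods only).
 */
public class StringUtils {

  // No instances; this class is only a home for static helpers.
  private StringUtils() {}

  // Returns true if the string is null, empty, or only whitespace.
  public static boolean isBlank(String s) {
    return s == null || s.trim().isEmpty();
  }

  // Pads a string on the left with the given character up to the target length.
  public static String leftPad(String s, int length, char padChar) {
    StringBuilder sb = new StringBuilder();

    for (int i = (s == null ? 0 : s.length()); i < length; i++) {
      sb.append(padChar);
    }

    if (s != null) {
      sb.append(s);
    }

    return sb.toString();
  }
}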

By using a Singleton, you have more freedom to design the true behavior of your class, including how the class is created and initialized. This is where the Factory of Singletons comes into play as well. I usually like to create a Factory, which itself is a Singleton, that creates a Singleton. This factory can then be used to return an implementation of an interface, which a range of Singletons could implement. The factory can also be used to correctly call the initialization methods after the “GetInstance()” static (and atomic, aka synchronized) method is called to obtain the instance of the Singleton. This is perfect in case you need to call a specific initialization function or set root parameters on the Singleton that you don’t want the users of your singleton code to have to deal with.

The single most important factor that makes a class a Singleton is a “private constructor”.

By having a private constructor, there is NO other Class except for itself that can create an Object instance of that class.

The private constructor is not by itself how a Singleton is built; technically, I could have a private constructor and still have a static method that returns unlimited new and distinct instances of the class. So we need a “Get Instance” method, basically a Factory method, that ensures that One and Only One instance of the class we are trying to make a singleton is created in memory for a particular process. Because the constructor is private, the only way to access this Get Instance method is to make it public and static. (I recently read a post on Java’s Facebook page asking for a one line explanation of why you would use the static keyword to modify a method signature. Well, simply: we use the static modifier on a method to make that method available at the Class level. This means that there’s only one instance of that method in memory for all object instances. This also works to our advantage for the singleton implementation, because I can’t create an object instance of the singleton class myself, so calling public static methods is the only way to interact with the Singleton before I call the Get Instance method, which will return the one and only Object instance of the singleton class.)

This is usually done by using lazy instantiation in the Get Instance method. Lazy Instantiation, or Lazy Initialization, is when we do not “pre-create” or pre-initialize an Object or other structure or section of code until the first usage of that object, structure, or code is required. This is a great concept when working with limited resources such as Memory, or when we don’t know if we will ever need an instance of something, so why bother wasting processing cycles and RAM on creating that object.

The third and final component that makes a class a singleton is a private static field whose data type is the Singleton’s class itself. This field is set by the lazy instantiation code of the Get Instance method, and it is used in the lazy instantiation check: if it is NULL, call the private constructor and set the private static field. Then simply always return the private static instance of the singleton class. This makes our class a singleton. The only object instance that exists, and will ever exist while the process is running, is the private static instance of that class.

The Get Instance method also needs to be atomic, or synchronized, to make sure it’s thread safe. If not, technically two calls to Get Instance by two separate threads could potentially create two or more objects of the singleton, and all of those objects would eventually be lost once the threads stop referring to them; only the last one that set the static instance field of the Singleton class would survive. This could potentially cause data corruption or other weird runtime related issues, because we are assuming you need a single instance of an object globally for some critical reason, not just for fun…

Perfect example of a simple Singleton: “GlobalMap” –

/*
 * Created on March 1, 2004
 */

package com.roguelogic.util;

import java.util.HashMap;

/**
 * @author Robert C. Ilardi
 */

public class GlobalMap {

  private HashMap<Object, Object> cache;
  private static GlobalMap globalSpace = null;

  private GlobalMap() {
    cache = new HashMap<Object, Object>();
  }

  public static synchronized GlobalMap getInstance() {
    if (globalSpace == null) {
      globalSpace = new GlobalMap();
    }

    return globalSpace;
  }

  public synchronized void clear() {
    cache.clear();
  }

  public synchronized void store(Object key, Object value) {
    cache.put(key, value);
  }

  public synchronized Object retrieve(Object key) {
    return cache.get(key);
  }

  public synchronized Object remove(Object key) {
    return cache.remove(key);
  }

  public synchronized boolean containsKey(Object key) {
    return cache.containsKey(key);
  }

}

As you can see from the simple class above, “GlobalMap”, the idea was to create a Singleton HashMap, so that we could share the HashMap throughout a java process without having to pass it from class to class. In large programs with 100’s or even 1000’s of classes, passing it around would not only be impractical, but bad coding practice. The GlobalMap uses the 3 root concepts of the singleton design pattern:

1. Private Constructor: Line 18 –

private GlobalMap() {
  cache = new HashMap<Object, Object>();
}

2. Private Static Singleton Class Data Type Data Member: Line 16 –

private static GlobalMap globalSpace = null;

3. A thread safe Public Static Get Instance (factory method), which uses lazy instantiation to create and initialize the private static singleton class data type data member: Line 22 –

public static synchronized GlobalMap getInstance() {
  if (globalSpace == null) {
    globalSpace = new GlobalMap();
  }

  return globalSpace;
}

Factories of singletons are simply Factories that return instances of singleton implementations. A factory of singletons could simply always return the same singleton implementation, or it could, based on some criteria, return different singleton implementations.

The advantage of using a Factory that creates the singleton is that the factory can contain additional logic, outside of the singleton, that calls certain initialization methods and sets configuration settings on the singleton at the first call to the “create” method in the factory, so that the singleton is initialized once and only once. This abstracts the responsibility for initializing the singleton correctly on the first call to its Get Instance method away from the end users of your singleton class.

Technically you could put the responsibility of calling certain initialization methods on the singleton after the first call to the Get Instance method in the hands of your users, but then they would have to maintain counters or flags to indicate when the first call to Get Instance was made. This in a sense negates some of the direct benefits of a singleton, and therefore wrapping this logic yourself in a Factory is a much better choice.

However, a Factory of Singletons should only be used if you need a complex initialization routine, after the call to the singleton’s private constructor, that requires a lot of outside input or parameters to be set on the Singleton and that is too cumbersome to do within the singleton itself.

There’s also a neat trick I have used in the past when combining Factories and singletons. Sometimes I remove the Get Instance method from the singleton itself and create a “Protected Constructor” instead of a private one. Although this technically means it’s not a true singleton, in a language like Java this allows “friends”, or classes that exist in the same package, to invoke the constructor directly. If you put the Factory of the Protected Constructor Singleton in the same java Package, it can then be responsible for creating the singleton object instance instead of the singleton itself.

Also, without a Get Instance method in the singleton class itself, this basically forces the users of your code to always go through your Singleton Factory, instead of trying to construct the singleton themselves and possibly messing up the initialization procedure, which should be done by calling the create method on the Factory instead.
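
Here is a minimal sketch of that protected constructor variation. The class names (ConfigStore and ConfigStoreFactory) and the single “environment” setting are made up for illustration; the point is simply that, because the singleton has no public constructor and no Get Instance method of its own, outside code is forced to come through the factory in the same package, which performs the one-time initialization before handing out the instance.

import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a singleton with a protected constructor (illustrative names).
 * Only the factory below, which lives in the same package (here the same
 * source file), can construct and initialize it.
 */
class ConfigStore {

  private Map<String, String> settings;
  private boolean initialized = false;

  // Protected constructor: reachable from the same package (and subclasses),
  // but NOT from outside callers.
  protected ConfigStore() {
    settings = new HashMap<String, String>();
  }

  // One-time initialization the end user should never have to call directly.
  protected synchronized void init(String environment) {
    if (!initialized) {
      settings.put("environment", environment);
      initialized = true;
    }
  }

  public synchronized String getSetting(String name) {
    return settings.get(name);
  }
}

/**
 * The factory guarantees init() runs exactly once before the one and only
 * ConfigStore instance is handed out.
 */
public class ConfigStoreFactory {

  private static ConfigStore instance = null;

  private ConfigStoreFactory() {}

  public static synchronized ConfigStore create(String environment) {
    if (instance == null) {
      instance = new ConfigStore();
      instance.init(environment); // initialization handled here, not by users
    }

    return instance;
  }

  // Quick usage check: prints DEV
  public static void main(String[] args) {
    ConfigStore store = ConfigStoreFactory.create("DEV");
    System.out.println(store.getSetting("environment"));
  }
}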

I think this shows the real power of design patterns: they are more guidelines than anything else, and you can use them in combination or modify them to suit your needs.

My posts on design patterns seem to be the most popular posts on my blog to date (besides my Tesla Coil post), so I’m definitely going to write a post on Factories and Abstract Factories in the near future. Please stay tuned for that! In the meantime, please feel free to check out my design pattern post on Adapter Factories.

Just Another Stream of Random Bits…
– Robert C. Ilardi
 
 
Posted in Architecture | 1 Comment