Big data has revolutionized data-analysis techniques, as large enterprises hold huge datasets that grow every day. These datasets, which may be structured, semi-structured, or unstructured, require large storage capacity and careful management. They also call for software frameworks that differ from conventional databases, such as the Hadoop framework, which makes data management easier in this era of large-scale data. The Hadoop Distributed File System (HDFS) is critical for managing data stored in cloud systems. Hadoop can handle and process huge amounts of data in various formats, whereas the traditional RDBMS (Relational Database Management System) approach is inefficient and time-consuming for processing large datasets. This thesis explores using the Hadoop ecosystem to analyze a particular dataset based on images. The data was obtained from the large collection of machine-learning datasets at UC Irvine. To extend this sample exploration to actual large data, one could automate the extraction of image properties and then turn to picture-sharing sites such as Flickr to explore truly big data in various ways.
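As a minimal sketch of what automating "obtaining properties of images" could look like, the hypothetical Python function below (not part of the thesis pipeline) reads the width and height directly from the header bytes of a PNG file; such a function could serve as the per-record extraction step in a Hadoop map task, since it needs only the raw bytes of each image.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) parsed from raw PNG bytes.

    In the PNG format, the IHDR chunk immediately follows the 8-byte
    signature; width and height are stored as big-endian 32-bit
    integers at byte offsets 16 and 20 of the file.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

In a real deployment one would dispatch on file type (JPEG, PNG, GIF, and so on) and extract richer properties, but the principle is the same: each image yields a small record of properties that downstream Hadoop jobs can aggregate.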