ALY 6110 Northeastern Crime Analysis Report in Boston Project Analysis

Overview and Rationale

Spark is intended for use with data lakes, which were discussed previously. It is important to be able to process these large data sets effectively with Spark. This assignment gives you experience and practice using Spark to analyze a large data set.

Assignment Summary

For this assignment, you will download two of the following datasets and process them with Spark.

I am sharing some resources with you, but feel free to pick your own problem/dataset.


Write a 3-5 page report that includes a section for each data set you choose to analyze. For each data set, include:

  • A description of the steps you took to perform the analysis, with screenshots
  • Results of your analysis
  • Your insights based on your analysis

Format & Guidelines

The paper should use the following format:

(i) Introduction

Provide a short description of the dataset you analyzed and the purpose of the analysis. Identify the questions you are attempting to answer or the insights you want to gain from the analysis.

(ii) Analysis and results

Outline your steps, with screenshots, and provide the results of your analysis. Connect the results and your analysis to the purpose described in the introduction. Be specific.

(iii) Insights

Provide your insights based on your analysis. Connect your insights to the purpose of the analysis.

