The Battle of Neighborhoods

Andreas karlsson
6 min read · May 1, 2021

1. Introduction: Business Problem

Background: According to the latest NYC & Company release, New York City welcomed about 65.2 million tourists in 2018: 51.6 million domestic and 13.5 million international visitors. These numbers increase from year to year (https://en.wikipedia.org/wiki/Tourism_in_New_York_City). New York City has the largest selection of lodging in the country, from hostels to luxury hotels. Prices range from $100 to several thousand dollars per night, with an average of 292 USD per night. Hotel occupancy is also high: it was 88% in 2018 (https://assets.simpleviewinc.com/simpleview/image/upload/v1/clients/newyorkcity/FYI_Hotel_reports_February_2019_8607015b-b32a-4c7f-9fbd-84cd2a93cbe6.pdf). Visitors prefer short stays, often over weekends, averaging 2.4 nights (https://aka.nyc/content/uploads/2017/12/new_york_city_travel_and_tourism_trend_report_2017.pdf).

Problem description: New York City has almost 300 hotels with over 75,000 hotel rooms, and Airbnb had more than 50,000 apartment listings in the city in 2018, so it can be hard to find the right fit or to know how much your money will buy. In this project we will try to find the best neighborhoods in Manhattan where a tourist can rent accommodation via Airbnb, have a pleasant stay in NYC, and easily reach the most visited attractions such as Central Park and Times Square.

Target Audience: This investigation should interest visitors to New York City who prefer short stays (from 1 night) and want to select the best neighborhoods in Manhattan, New York.

Success Criteria: The project succeeds if it produces a recommendation of apartment clusters that have the best score, calculated from:

- accommodation price with fees;
- location of the accommodation;
- venues within a radius of 1,000 meters of the accommodation;
- crime rate within a radius of 100 meters of the accommodation.
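The two radius-based criteria come down to counting points that lie within a given distance of a listing. The sketch below shows one way this could be done with the haversine great-circle formula. The function names and the sample coordinates are hypothetical and chosen only for illustration; they are not taken from the project's notebook or datasets.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_within(center, points, radius_m):
    """Count how many (lat, lon) points fall within radius_m meters of center."""
    lat0, lon0 = center
    return sum(1 for lat, lon in points if haversine_m(lat0, lon0, lat, lon) <= radius_m)


# Hypothetical coordinates for illustration only (not from the real datasets).
listing = (40.7580, -73.9855)
crimes = [
    (40.7580, -73.9855),  # same location as the listing
    (40.7585, -73.9855),  # roughly 55 m north
    (40.7680, -73.9855),  # roughly 1.1 km north
]
crimes_100m = count_within(listing, crimes, 100)
```

The same helper could serve the 1,000-meter venue criterion by passing venue coordinates and a larger radius. For a dataset of this size, a spatial index would be faster than the pairwise loop shown here.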

2. Data

In our investigation we will use free, publicly available datasets. We will evaluate the Airbnb accommodations available in Manhattan, New York in 2019 and define the most reasonable sets (clusters) of apartments for visitors.

Based on the definition of our problem, we assume that the following factors will help us:

- average accommodation prices by neighborhood;
- number of tourist attractions near the accommodation;
- number of crimes near the accommodation.

The following data sources are needed for our project:

- New York City apartment listings from the Inside Airbnb site: http://data.insideairbnb.com/united-states/ny/new-york-city/2019-12-04/data/listings.csv.gz
- New York Neighborhood Tabulation Areas: https://data.cityofnewyork.us/api/geospatial/cpf4-rkhq?method=export&format=GeoJSON
- Foursquare API, to extract data about venues: food places, museums, galleries, shopping centers, sightseeing attractions, concert halls, and so on
- New York crime data for 2019: https://data.cityofnewyork.us/api/views/5uac-w243/rows.csv?accessType=DOWNLOAD
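As a rough illustration of how the listings file could be turned into a price-per-person figure, here is a minimal sketch using only the standard library. It rests on several assumptions: that the listings file has columns named `id`, `neighbourhood_group_cleansed`, `price`, `cleaning_fee`, and `accommodates`, that money values are strings such as `$1,200.00`, and that price per person means nightly price plus cleaning fee divided by the number of guests. The article does not spell out its formula, so treat this definition as a placeholder.

```python
import csv
import io


def parse_money(text):
    """Convert a string such as '$1,250.00' to a float; empty values become 0.0."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else 0.0


def manhattan_price_per_person(rows):
    """Yield (listing id, price per person) for Manhattan rows.

    Assumption: price per person = (nightly price + cleaning fee) / guests.
    """
    for row in rows:
        if row["neighbourhood_group_cleansed"] != "Manhattan":
            continue
        guests = int(row["accommodates"])
        if guests <= 0:
            continue  # skip rows with unusable capacity
        total = parse_money(row["price"]) + parse_money(row["cleaning_fee"])
        yield row["id"], round(total / guests, 2)


# Tiny in-memory sample standing in for the decompressed listings file.
sample = io.StringIO(
    "id,neighbourhood_group_cleansed,price,cleaning_fee,accommodates\n"
    '1,Manhattan,"$1,200.00",$100.00,4\n'
    "2,Brooklyn,$80.00,$20.00,2\n"
    "3,Manhattan,$90.00,,2\n"
)
prices = dict(manhattan_price_per_person(csv.DictReader(sample)))
```

For the real download, the same function could be fed a reader opened with `gzip.open(path, "rt", encoding="utf-8", newline="")`, where `path` points to a local copy of the `listings.csv.gz` file.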

3. Methodology

In this project we try to detect Manhattan neighborhoods that have rental accommodations with positive reviews, reasonable prices, a low number of crimes, and tourist attractions nearby.

In the first step we collected the following data:

- Airbnb accommodations with their NYC Tabulation Area (official neighborhood name);
- the number of crimes near each Airbnb accommodation;
- the NYC Tabulation Area (official neighborhood name) for each Manhattan crime case.

The second step in our analysis is to calculate and explore characteristics of the different Manhattan neighborhoods:

- number of crimes in the area;
- average price per person;
- number of accommodations available.

In the third and final step we will:

- select the Top-100 Airbnb accommodations based on summary rating, number of crimes, and price per person;
- invoke the Foursquare API to find venues near the top accommodations;
- create and investigate clusters (using k-means clustering) of our accommodations to make recommendations to tourists.
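To make the clustering step more concrete, here is a dependency-free sketch of k-means (Lloyd's algorithm) with k = 3. It is not the code used in the analysis, which most likely relied on a library implementation. Two choices here are our own assumptions: features are min-max scaled so that price and crime counts contribute comparably, and centroids are seeded with a deterministic farthest-first rule instead of the usual random or k-means++ seeding, purely so the example gives the same result on every run. The feature rows are hypothetical.

```python
import math


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [tuple((v - lo) / span for v, lo, span in zip(row, lows, spans)) for row in rows]


def farthest_first(points, k):
    """Deterministic seeding: take the first point, then keep adding the point
    that is farthest from all seeds chosen so far."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm: returns (centroids, labels) for a list of equal-length tuples."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical (price per person, crimes within 100 m) rows, for illustration only.
features = [
    (50, 10), (55, 12), (60, 8),       # cheaper, fewer crimes nearby
    (150, 90), (160, 100), (155, 95),  # pricier, more crimes nearby
    (150, 10), (160, 15), (155, 5),    # pricier, fewer crimes nearby
]
centroids, labels = kmeans(min_max_scale(features), k=3)
```

In the real analysis the feature vector would also carry venue information from Foursquare, for example the frequency of each venue category near a listing.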

5. Results and Discussion

During the analysis, three clusters were defined. In all clusters, ‘Food Place’ is the most common venue category, so that is what the clusters share. They differ in the following characteristics:

- average price per person;
- average crime rate;
- second most common venue category;
- number of available Airbnb accommodations;
- neighborhood locations.

Cluster 0 (Mix) is the most generic cluster:

- average price_per_person: $110;
- average crime rate: 67, but it varies widely by neighborhood, from 3 to 385 crime cases within a radius of 100 meters of the accommodation;
- a mix of all venue categories (Fine Arts, Shopping, Entertainment);
- contains 58% of the accommodations selected for analysis (Top-100 Airbnb accommodations);
- spreads over almost all of Manhattan's areas.

Cluster 1 (Entertainment) is the smallest cluster and is characterized by entertainment venues (Nightclub, Stadium, Pub, Theater, Concert, and so on):

- highest average price_per_person among all clusters: $111;
- highest average crime rate among all clusters: 102;
- Entertainment is both the 1st and the 2nd top common venue category;
- contains 15% of the top accommodations (Top-100 Airbnb accommodations);
- spreads over the Chelsea, Hell’s Kitchen, and Midtown Airbnb neighborhoods.

Cluster 2 (Sightseeing) is the cheapest cluster, with many sightseeing attractions nearby (Monument/Landmark, Memorial Site, Historic Site, Lake, Park, Pier, and so on):

- lowest average price_per_person: $59;
- lowest crime rate among all clusters: 65;
- Sightseeing is the second top common venue category;
- contains 27% of the top accommodations (Top-100 Airbnb accommodations);
- spreads over East Harlem, Financial District, Harlem, Inwood, Morningside Heights, Roosevelt Island, Upper West Side, Washington Heights, and West Village.

We identified three clusters from which a visitor can choose an appropriate accommodation based on their preferences or needs.

Top Neighborhoods Statistics

Top-5 Manhattan Tabulation Areas (Airbnb neighborhoods) with the lowest average price per person in 2019:

- Marble Hill-Inwood (Marble Hill, Inwood): 45.48 USD, 25 accommodations
- Washington Heights South (Washington Heights): 46.79 USD, 82 accommodations
- Washington Heights North (Inwood, Washington Heights): 54.74 USD, 53 accommodations
- Central Harlem North-Polo Grounds (Harlem, East Harlem): 57.00 USD, 132 accommodations
- Manhattanville (Harlem): 59.75 USD, 25 accommodations

Top-5 Manhattan Tabulation Areas (Airbnb neighborhoods) with the lowest crime level in 2019:

- Stuyvesant Town-Cooper Village (Stuyvesant Town): 145
- park-cemetery-etc-Manhattan (Inwood, Washington Heights): 1,213
- Lenox Hill-Roosevelt Island (Roosevelt Island, Upper East Side): 1,604
- Manhattanville (Harlem): 1,832
- Yorkville (Upper East Side): 1,898
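A ranking like the price list above is a group-and-average operation. The following is a minimal sketch of that idea, assuming each listing has already been tagged with its tabulation area and a price per person. The function name, the `min_listings` filter, and the sample records are hypothetical and are not taken from the analysis.

```python
from collections import defaultdict
from statistics import mean


def cheapest_areas(records, top_n=5, min_listings=1):
    """Rank areas by mean price per person.

    Returns a list of (area, mean price, listing count) tuples, cheapest first.
    """
    by_area = defaultdict(list)
    for area, price_per_person in records:
        by_area[area].append(price_per_person)
    ranked = [
        (area, round(mean(prices), 2), len(prices))
        for area, prices in by_area.items()
        if len(prices) >= min_listings  # optionally ignore areas with too few listings
    ]
    return sorted(ranked, key=lambda item: item[1])[:top_n]


# Hypothetical (area, price per person) pairs, not taken from the real data.
records = [
    ("Area A", 40.0), ("Area A", 50.0),
    ("Area B", 60.0),
    ("Area C", 30.0), ("Area C", 35.0), ("Area C", 40.0),
]
top = cheapest_areas(records, top_n=2)
```

Reporting the listing count next to each average, as the list above does, helps readers judge how much weight to give an area with only a few accommodations.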

6. Conclusion

To conclude, a basic data analysis was performed to identify clusters of Manhattan neighborhoods for a short-stay visit. During the analysis, we cleansed and investigated datasets on Manhattan neighborhoods, computed some statistical characteristics, and visualized them.

The aim of this project is to help Manhattan visitors select the Airbnb neighborhoods in which to stay based on the most common venues, prices, and safety characteristics:

- if a person is interested in entertainment (nightlife, pubs, concerts, movies), we recommend looking at accommodations from Cluster 1 (Entertainment): the Chelsea, Hell’s Kitchen, and Midtown Airbnb neighborhoods. The person should, however, take into consideration the high prices and crime rate of this location;
- if a person is looking for a neighborhood with lower prices and nice views nearby, we recommend looking at Cluster 2 (Sightseeing): East Harlem, Financial District, Harlem, Inwood, Morningside Heights, Roosevelt Island, Upper West Side, Washington Heights, and West Village;
- if a person does not have any preferences, we recommend investigating proposals from Cluster 0 (Mix). It has average prices and spreads over almost all of Manhattan’s neighborhoods.
