Data Visualization_10 Chart Types_Geo chart

Types of Python visualization charts

1. Column/Bar chart
2. Dual Axis, Pareto chart
3. Pie chart
4. Line chart
5. Scatter chart
6. Bubble chart
7. Heat map
8. Histogram
9. Box plot
10. Geo chart

 

 

Goal: plot the 2017 marathon data on a map.
We will use folium to do it.

 

 

# 210.py

 

import pandas as pd

marathon_2017 = pd.read_csv('data_boston/marathon_results_2017.csv')
marathon_2017.loc[:, '10K':'Pace']
print(marathon_2017.shape)
marathon_2017.head(5)

 

## Convert the time data from object type to int.

import numpy as np

# Approach without a for loop
# Run either this block or the for loop below, not both: converting the
# integer seconds a second time would treat them as nanoseconds.
marathon_2017['5K'] = pd.to_timedelta(marathon_2017['5K']).astype('m8[s]').astype(np.int64)
marathon_2017['10K'] = pd.to_timedelta(marathon_2017['10K']).astype('m8[s]').astype(np.int64)
marathon_2017['15K'] = pd.to_timedelta(marathon_2017['15K']).astype('m8[s]').astype(np.int64)
marathon_2017['20K'] = pd.to_timedelta(marathon_2017['20K']).astype('m8[s]').astype(np.int64)
marathon_2017['25K'] = pd.to_timedelta(marathon_2017['25K']).astype('m8[s]').astype(np.int64)
marathon_2017['Half'] = pd.to_timedelta(marathon_2017['Half']).astype('m8[s]').astype(np.int64)
marathon_2017['30K'] = pd.to_timedelta(marathon_2017['30K']).astype('m8[s]').astype(np.int64)
marathon_2017['35K'] = pd.to_timedelta(marathon_2017['35K']).astype('m8[s]').astype(np.int64)
marathon_2017['40K'] = pd.to_timedelta(marathon_2017['40K']).astype('m8[s]').astype(np.int64)
marathon_2017['Pace'] = pd.to_timedelta(marathon_2017['Pace']).astype('m8[s]').astype(np.int64)
marathon_2017['Official Time'] = pd.to_timedelta(marathon_2017['Official Time']).astype('m8[s]').astype(np.int64)
marathon_2017.loc[:, '10K':'Pace']
import numpy as np

# Apply pd.to_timedelta()
# Apply .astype('m8[s]').astype(np.int64): convert to seconds, then cast to int

points = ['5K', '10K', '15K', '20K', 'Half', '25K', '30K', '35K', '40K', 'Pace', 'Official Time']
for point in points:
    marathon_2017[point] = pd.to_timedelta(marathon_2017[point]).astype('m8[s]').astype(np.int64)
# Check that the time data is now int
marathon_2017.loc[:, '10K':'Pace'].head(10)
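To make the conversion easier to follow, here is a minimal sketch on a few made-up split times. The `sample` values are hypothetical, and it uses `.dt.total_seconds()` rather than the `astype('m8[s]')` chain from the post, as one alternative way to get whole seconds.

```python
import pandas as pd

# Hypothetical split times in the same "H:MM:SS" text format as the CSV columns.
sample = pd.Series(['0:25:05', '1:02:30', '2:09:37'])

# to_timedelta parses the text into timedeltas; total_seconds() returns
# float seconds, which are then cast to plain integers.
seconds = pd.to_timedelta(sample).dt.total_seconds().astype('int64')

print(seconds.tolist())  # [1505, 3750, 7777]
```

With the times as integer seconds, comparing a split against a threshold such as 7200 (2 hours) becomes a simple numeric comparison.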
# Determine each runner's location
check_time = 7200    # 2 hours
Lat = 0
Long = 0
Location = ''

# 5K, 10K, 15K, 20K, 25K, 30K, 35K, 40K
points = [[42.247835,-71.474357], [42.274032,-71.423979], [42.282364,-71.364801], [42.297870,-71.284260],
          [42.324830,-71.259660], [42.345680,-71.215169], [42.352089,-71.124947], [42.351510,-71.086980]]
%%time
# Loop over the roughly 26,000 rows and decide where each runner is.
# iterrows() iterates over the rows one at a time.
# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.
locations = []

for index, record in marathon_2017.iterrows():
    if (record['40K'] < check_time):
        Lat = points[7][0]
        Long = points[7][1]
    elif (record['35K'] < check_time):
        Lat = points[6][0]
        Long = points[6][1]
    elif (record['30K'] < check_time):
        Lat = points[5][0]
        Long = points[5][1]
    elif (record['25K'] < check_time):
        Lat = points[4][0]
        Long = points[4][1]
    elif (record['20K'] < check_time):
        Lat = points[3][0]
        Long = points[3][1]
    elif (record['15K'] < check_time):
        Lat = points[2][0]
        Long = points[2][1]
    elif (record['10K'] < check_time):
        Lat = points[1][0]
        Long = points[1][1]
    elif (record['5K'] < check_time):
        Lat = points[0][0]
        Long = points[0][1]
    else:
        # Has not reached 5K yet: place the runner at the 5K point
        Lat = points[0][0]
        Long = points[0][1]
    locations.append({'Lat' : Lat, 'Long' : Long})

# Build the DataFrame once, after the loop
marathon_location = pd.DataFrame(locations, columns=['Lat', 'Long'])
print(marathon_location.shape)
marathon_location.head(10)
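The row-by-row loop above can take a while on this many rows. As a rough sketch of a vectorized alternative (not the method used in this post), the same "farthest checkpoint passed before `check_time`" rule can be written with NumPy. The `demo` data, `split_cols`, `coords`, and `last` are hypothetical names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

check_time = 7200  # 2 hours, in seconds
split_cols = ['5K', '10K', '15K', '20K', '25K', '30K', '35K', '40K']
# Same checkpoint coordinates as in the post, in the same order as split_cols.
coords = np.array([[42.247835, -71.474357], [42.274032, -71.423979],
                   [42.282364, -71.364801], [42.297870, -71.284260],
                   [42.324830, -71.259660], [42.345680, -71.215169],
                   [42.352089, -71.124947], [42.351510, -71.086980]])

# Hypothetical split times (seconds) for three runners.
demo = pd.DataFrame(
    [[1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000],
     [900, 1800, 2700, 3600, 4500, 5400, 6300, 7100],
     [7500, 15000, 22500, 30000, 37500, 45000, 52500, 60000]],
    columns=split_cols)

# True where a runner reached that checkpoint before check_time.
passed = demo[split_cols].to_numpy() < check_time

# argmax on the reversed columns finds the last True in each row,
# i.e. the farthest checkpoint already passed.
last = passed.shape[1] - 1 - np.argmax(passed[:, ::-1], axis=1)

# Runners with no checkpoint passed fall back to the 5K point,
# matching the else branch of the loop.
last[~passed.any(axis=1)] = 0

demo_location = pd.DataFrame(coords[last], columns=['Lat', 'Long'])
print(last.tolist())  # [6, 7, 0]
```

On the real data, `demo` would be replaced by `marathon_2017`, assuming the split columns have already been converted to integer seconds.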

 

Counting runners per point

marathon_count = marathon_location.groupby(['Lat', 'Long']).size().reset_index(name='Count')
marathon_count
marathon_count.Count
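To see what the `groupby(...).size().reset_index(name='Count')` step produces, here is a small sketch on made-up locations. `demo_location` and `demo_count` are hypothetical names used only for this example.

```python
import pandas as pd

# Hypothetical locations: two runners at one point, one runner at another.
demo_location = pd.DataFrame({'Lat': [42.351510, 42.351510, 42.352089],
                              'Long': [-71.086980, -71.086980, -71.124947]})

# size() counts the rows in each (Lat, Long) group;
# reset_index(name='Count') turns the result back into a regular DataFrame.
demo_count = demo_location.groupby(['Lat', 'Long']).size().reset_index(name='Count')

print(demo_count['Count'].tolist())  # [2, 1]
```

Each row of the result is one checkpoint with the number of runners assigned to it, which is the shape the bubble chart and the heat map both consume.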
import matplotlib.pyplot as plt

# Set the figure size
plt.figure(figsize=(20, 10))

# Draw the bubble chart (s option: bubble size)
plt.scatter(marathon_count.Lat, marathon_count.Long, s=marathon_count.Count, alpha=0.5)

# Add the title and axis labels
plt.title('Runner locations at the 2-hour mark')
plt.xlabel('Latitude')
plt.ylabel('Longitude')

# Annotate each location with its Count value
for i, txt in enumerate(marathon_count.Count):
    plt.annotate(txt, (marathon_count.Lat[i], marathon_count.Long[i]), fontsize=18)

plt.show()

 

Drawing on a map with Folium

!pip install folium
import folium
from folium.plugins import HeatMap

# Draw the Folium marathon map
marathon_map = folium.Map(location=[42.324830, -71.259660],
                         tiles='OpenStreetMap',  # map tile style; alternatives: 'Stamen Toner', 'Stamen Terrain'
                         zoom_start=11)

HeatMap(marathon_count, radius=25).add_to(marathon_map)
marathon_map
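`HeatMap` reads each row as latitude, longitude, and an optional weight, so the `Count` column acts as the weight here. As a hedged sketch, the same data can be prepared explicitly as a list of `[lat, lng, weight]` rows. Scaling the weights to the 0 to 1 range is my own assumption for keeping intensities comparable, not something the post does, and `demo_count` and `heat_data` are hypothetical names.

```python
import pandas as pd

# Hypothetical runner counts at three checkpoints.
demo_count = pd.DataFrame({'Lat': [42.345680, 42.352089, 42.351510],
                           'Long': [-71.215169, -71.124947, -71.086980],
                           'Count': [500, 2000, 100]})

# Scale the counts so the busiest checkpoint has weight 1.0 (an assumption).
weights = demo_count['Count'] / demo_count['Count'].max()

# One [lat, lng, weight] row per checkpoint.
heat_data = [[lat, lng, w]
             for lat, lng, w in zip(demo_count['Lat'], demo_count['Long'], weights)]

print(heat_data[1])  # [42.352089, -71.124947, 1.0]
```

The list could then be passed in place of the DataFrame, for example `HeatMap(heat_data, radius=25).add_to(marathon_map)`.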
folium: options that can be used for tiles
https://python-graph-gallery.com/288-map-background-with-folium/