Read and format project data
# Include and execute your code here
import pandas as pd

df = pd.read_csv("https://github.com/byuidatascience/data4names/raw/master/data-raw/names_year/names_year.csv")
Course DS 250
Brayden McAllister
Paste your elevator pitch here: a short (4-5 sentence) paragraph that describes key insights taken from metrics in the project results. Think top or most important results.
Highlight the Questions and Tasks
Write an SQL query to create a new dataframe about baseball players who attended BYU-Idaho. The new table should contain five columns: playerID, schoolID, salary, and the yearID/teamID associated with each salary. Order the table by salary (highest to lowest) and print out the table in your report.
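One way to approach this task is sketched below with Python's built-in `sqlite3` module. It is a minimal illustration, not the required solution: the table names `collegeplaying` and `salaries`, the school code `'idbyuid'`, and every row of data are assumptions made up for the example, so check them against the actual course database before reusing the query.

```python
import sqlite3

# Toy in-memory database. Table names, the 'idbyuid' school code, and all
# rows are hypothetical stand-ins for the real baseball database.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE collegeplaying (playerID TEXT, schoolID TEXT, yearID INTEGER);
CREATE TABLE salaries (playerID TEXT, yearID INTEGER, teamID TEXT, salary REAL);
INSERT INTO collegeplaying VALUES
    ('player_a', 'idbyuid', 1999),
    ('player_a', 'idbyuid', 2000),
    ('player_b', 'otherschool', 2001);
INSERT INTO salaries VALUES
    ('player_a', 2005, 'AAA', 500000),
    ('player_a', 2006, 'BBB', 750000),
    ('player_b', 2006, 'AAA', 900000);
""")

# DISTINCT removes the duplicate salary rows that the join would otherwise
# produce for a player who appears in collegeplaying for several years.
query = """
SELECT DISTINCT s.playerID, c.schoolID, s.salary, s.yearID, s.teamID
FROM salaries AS s
JOIN collegeplaying AS c ON c.playerID = s.playerID
WHERE c.schoolID = 'idbyuid'
ORDER BY s.salary DESC
"""
byui_salaries = con.execute(query).fetchall()
for row in byui_salaries:
    print(row)
```

Without `DISTINCT`, each salary would be repeated once per college season the player has on record, which inflates the table without adding information.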
Type your results and analysis here.
Include figures in chunks and discuss your findings alongside each figure.
::: {#cell-Q1 chart .cell execution_count=4}
My useless chart
:::
::: {#cell-Q1 table .cell .tbl-cap-location-top tbl-cap=‘Not much of a table’ execution_count=5}
| | year | AK | AR |
|---|---|---|---|
| 96 | 2006 | 21.0 | 183.0 |
| 97 | 2007 | 28.0 | 153.0 |
| 98 | 2008 | 36.0 | 212.0 |
| 99 | 2009 | 34.0 | 179.0 |
| 100 | 2010 | 22.0 | 196.0 |
| 101 | 2011 | 41.0 | 148.0 |
| 102 | 2012 | 28.0 | 140.0 |
| 103 | 2013 | 26.0 | 134.0 |
| 104 | 2014 | 20.0 | 114.0 |
| 105 | 2015 | 28.0 | 121.0 |
:::
This three-part question requires you to calculate batting average (number of hits divided by the number of at-bats).
Write an SQL query that provides playerID, yearID, and batting average for players with at least 1 at-bat that year. Sort the table from highest batting average to lowest, and then by playerID alphabetically. Show the top 5 results in your report.
Use the same query as above, but only include players with at least 10 at-bats that year. Print the top 5 results.
Now calculate the batting average for players over their entire careers (all years combined). Only include players with at least 100 at-bats, and print the top 5 results.
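The three parts above can be sketched with `sqlite3` as follows. This is an illustration under stated assumptions: the `batting` table with columns `H` (hits) and `AB` (at-bats) is assumed rather than confirmed by the source, the helper names are hypothetical, and the rows are invented. Note the `CAST(... AS REAL)`: SQLite performs integer division on two integers, which would turn most batting averages into 0.

```python
import sqlite3

# Toy in-memory table; the schema is an assumption and the rows are invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE batting (playerID TEXT, yearID INTEGER, H INTEGER, AB INTEGER);
INSERT INTO batting VALUES
    ('player_a', 2001, 1, 1),
    ('player_a', 2002, 30, 120),
    ('player_b', 2001, 4, 10),
    ('player_b', 2002, 45, 150),
    ('player_c', 2001, 0, 0);
""")


def yearly_batting_avg(min_ab, limit=5):
    """Per-season batting average for players with at least min_ab at-bats."""
    query = """
    SELECT playerID, yearID, CAST(H AS REAL) / AB AS batting_avg
    FROM batting
    WHERE AB >= ?
    ORDER BY batting_avg DESC, playerID ASC
    LIMIT ?
    """
    return con.execute(query, (min_ab, limit)).fetchall()


def career_batting_avg(min_ab, limit=5):
    """Career batting average: total hits over total at-bats per player."""
    query = """
    SELECT playerID, CAST(SUM(H) AS REAL) / SUM(AB) AS batting_avg
    FROM batting
    GROUP BY playerID
    HAVING SUM(AB) >= ?
    ORDER BY batting_avg DESC, playerID ASC
    LIMIT ?
    """
    return con.execute(query, (min_ab, limit)).fetchall()


part_a = yearly_batting_avg(1)     # at least 1 at-bat in the season
part_b = yearly_batting_avg(10)    # at least 10 at-bats in the season
part_c = career_batting_avg(100)   # at least 100 career at-bats
```

In the toy data, the 1 at-bat threshold lets a player with a single hit in a single at-bat top the list with a perfect average, which is why the later parts raise the minimum. The career version sums hits and at-bats before dividing rather than averaging the yearly averages, so seasons with more at-bats carry more weight.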
Type your results and analysis here.
Include figures in chunks and discuss your findings alongside each figure.
::: {#cell-Q2 chart .cell execution_count=7}
My useless chart
:::
::: {#cell-Q2 table .cell .tbl-cap-location-top tbl-cap=‘Not much of a table’ execution_count=8}
| | year | AK | AR |
|---|---|---|---|
| 96 | 2006 | 21.0 | 183.0 |
| 97 | 2007 | 28.0 | 153.0 |
| 98 | 2008 | 36.0 | 212.0 |
| 99 | 2009 | 34.0 | 179.0 |
| 100 | 2010 | 22.0 | 196.0 |
| 101 | 2011 | 41.0 | 148.0 |
| 102 | 2012 | 28.0 | 140.0 |
| 103 | 2013 | 26.0 | 134.0 |
| 104 | 2014 | 20.0 | 114.0 |
| 105 | 2015 | 28.0 | 121.0 |
:::
Pick any two baseball teams and compare them using a metric of your choice (average salary, home runs, number of wins, etc.). Write an SQL query to get the data you need, then make a graph using Plotly Express to visualize the comparison. What do you learn?
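As a rough sketch of how such a comparison might be wired together, the example below computes average salary per team and year with `sqlite3`, then hands the rows to Plotly Express. The `salaries` schema, the team codes `'AAA'` and `'BBB'`, the helper `plot_comparison`, and all data are hypothetical; the plotting function imports pandas and Plotly only when called, so the query part runs with the standard library alone.

```python
import sqlite3

# Toy in-memory table; the schema is an assumption and the rows are invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE salaries (playerID TEXT, yearID INTEGER, teamID TEXT, salary REAL);
INSERT INTO salaries VALUES
    ('p1', 2000, 'AAA', 100.0),
    ('p2', 2000, 'AAA', 300.0),
    ('p3', 2000, 'BBB', 400.0),
    ('p1', 2001, 'AAA', 200.0),
    ('p3', 2001, 'BBB', 600.0),
    ('p4', 2001, 'CCC', 999.0);
""")

# Average salary per team and year, restricted to the two teams compared.
query = """
SELECT teamID, yearID, AVG(salary) AS avg_salary
FROM salaries
WHERE teamID IN (?, ?)
GROUP BY teamID, yearID
ORDER BY yearID, teamID
"""
team_rows = con.execute(query, ("AAA", "BBB")).fetchall()


def plot_comparison(rows):
    """Build a line chart with one line per team (hypothetical helper)."""
    # Imported lazily so the query above does not require these packages.
    import pandas as pd
    import plotly.express as px

    frame = pd.DataFrame(rows, columns=["teamID", "yearID", "avg_salary"])
    return px.line(
        frame,
        x="yearID",
        y="avg_salary",
        color="teamID",
        title="Average salary by year",
    )
```

Calling `plot_comparison(team_rows).show()` would render the figure. Doing the aggregation in SQL keeps the result small, so only one row per team and year needs to be passed to the chart.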
Type your results and analysis here.
Include figures in chunks and discuss your findings alongside each figure.
::: {#cell-Q3 chart .cell execution_count=10}
My useless chart
:::
::: {#cell-Q3 table .cell .tbl-cap-location-top tbl-cap=‘Not much of a table’ execution_count=11}
| | year | AK | AR |
|---|---|---|---|
| 96 | 2006 | 21.0 | 183.0 |
| 97 | 2007 | 28.0 | 153.0 |
| 98 | 2008 | 36.0 | 212.0 |
| 99 | 2009 | 34.0 | 179.0 |
| 100 | 2010 | 22.0 | 196.0 |
| 101 | 2011 | 41.0 | 148.0 |
| 102 | 2012 | 28.0 | 140.0 |
| 103 | 2013 | 26.0 | 134.0 |
| 104 | 2014 | 20.0 | 114.0 |
| 105 | 2015 | 28.0 | 121.0 |
:::