Data Exploration

This section looks at a few interesting patterns in the data I will be using.

We can start with player positions. For example, CB (center back) is the most common position in the data, and CDM is the least common. A central defensive midfielder (CDM) plays a vital role that involves defending, distributing the ball, and supporting the midfield, and it is often considered one of the most challenging roles in soccer.
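The bar charts that follow rely on two Series, `player_positions` and `country_players`, whose construction is not shown in this section. A minimal sketch of how such counts could be built with `value_counts()` is given here. The toy table and its column names (`position`, `nationality`) are assumptions for illustration; the real dataset's columns may be named differently.

```python
import pandas as pd

# Toy stand-in for the player table; the column names are assumptions.
players = pd.DataFrame({
    "position": ["CB", "CB", "ST", "CB", "GK", "ST", "CDM"],
    "nationality": ["Spain", "Brazil", "Spain", "France", "Spain", "Brazil", "France"],
})

# value_counts() returns one count per distinct value, sorted from most
# to least frequent, with the values themselves as the index.
player_positions = players["position"].value_counts()
country_players = players["nationality"].value_counts()

print(player_positions)
print(country_players)
```

Because the result is already sorted by frequency, the most common position ends up as the first bar when the Series is passed to a bar plot.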

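import matplotlib.pyplot as plt
import seaborn as sns

# Bar chart of how many players play each position
# (player_positions holds the counts, indexed by position).
plt.figure(figsize=(10, 6))
sns.barplot(x=player_positions.index, y=player_positions.values, palette="plasma")
plt.title('Most Common Player Positions')
plt.xticks(rotation=45)
plt.show()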

plt.figure(figsize=(8, 6))
sns.barplot(x=country_players.index, y=country_players.values, palette="plasma")
plt.title('Number of Players per Country')
plt.xticks(rotation=45)
plt.show()

The next plot compares a player's skills with the wage they earn. Player skills include ball control, dribbling, passing, and shooting.

fig, ax = plt.subplots(figsize=(8,5))
plt.scatter(data = df1, x= 'skills', y='wage_eur')
plt.xlabel("skills") 
plt.ylabel("Wage in EUR")
plt.title("Skills vs. wage in EUR", fontsize=16)
plt.show()
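A scatter plot shows the shape of a relationship but not its strength. As a rough complement, the Pearson correlation between the two columns can be computed with `Series.corr`. The sketch below uses a small made-up table that reuses the column names from the plot; the numbers are illustrative only and are not taken from the dataset.

```python
import pandas as pd

# Made-up values for illustration; only the column names mirror the plot above.
toy = pd.DataFrame({
    "skills": [1, 2, 3, 4, 5],
    "wage_eur": [1000, 3000, 2500, 8000, 20000],
})

# Series.corr defaults to the Pearson correlation coefficient,
# which ranges from -1 (perfect inverse) to 1 (perfect linear).
r = toy["skills"].corr(toy["wage_eur"])
print(f"Pearson r between skills and wage: {r:.2f}")
```

Wage data is usually heavily skewed, so a handful of very high earners can dominate this number. It is best read alongside the plot rather than instead of it.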

Here you can see how wages vary with age.

fig, ax = plt.subplots(figsize=(8,5))
plt.scatter(data = df, x= 'age', y='wage_eur')
plt.xlabel("Age") 
plt.ylabel("Wage in EUR")
plt.title("age & wages in EUR", fontsize = 16)
plt.show()
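The scatter plot draws one point per player, so the "typical" wage at each age has to be judged by eye. One way to make it explicit is to group by age and take the median wage, which is less sensitive to a few very high earners than the mean. This is a sketch on made-up numbers, assuming the same `age` and `wage_eur` column names as above.

```python
import pandas as pd

# Made-up values for illustration; note the single outlier at age 25.
toy = pd.DataFrame({
    "age": [20, 20, 25, 25, 25, 30],
    "wage_eur": [1000, 3000, 5000, 7000, 60000, 9000],
})

# One median wage per age, indexed by age.
median_wage_by_age = toy.groupby("age")["wage_eur"].median()
print(median_wage_by_age)
```

In this toy table the mean wage at age 25 is 24,000 EUR, while the median stays at 7,000 EUR, which is closer to what most of those players earn.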

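The grid of plots below gives the big picture: each panel plots one attribute (the individual skills, wage, potential, and overall rating) against the physic rating. The overall figure size is set with the figsize argument, and the panels are laid out with subplot.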

df_x = df[['shooting','defending','passing','dribbling','pace','wage_eur','potential','overall']]

plt.figure(figsize=(9, 9))

plt.subplots_adjust(left=0.1,
                    bottom=0.1,
                    right=0.9,
                    top=0.9,
                    wspace=0.4,
                    hspace=0.4)

# Lay the panels out on a 3x3 grid; df_x has 8 columns, so one cell stays empty.
width = 3
height = 3
for index, col in enumerate(df_x.columns, start=1):
    plt.subplot(height, width, index)
    plt.scatter(x=df['physic'], y=df_x[col])
    plt.xlabel('physic')
    plt.ylabel(col)
    plt.xticks(rotation=45)
plt.show()
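The same grid can also be written with `plt.subplots`, which creates all the axes up front and returns them as an array. This is an alternative sketch, not the approach used above. It runs on randomly generated stand-in data (the `toy` frame is hypothetical) and hides the unused ninth panel explicitly.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random stand-in data with the same column names as the section above.
cols = ['physic', 'shooting', 'defending', 'passing', 'dribbling',
        'pace', 'wage_eur', 'potential', 'overall']
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.integers(30, 95, size=(50, len(cols))), columns=cols)

y_cols = [c for c in cols if c != 'physic']  # the 8 attributes to plot

# Create the whole 3x3 grid at once and flatten it to a 1-D array of axes.
fig, axes = plt.subplots(3, 3, figsize=(9, 9))
flat_axes = axes.ravel()

for ax, col in zip(flat_axes, y_cols):
    ax.scatter(toy['physic'], toy[col])
    ax.set_xlabel('physic')
    ax.set_ylabel(col)

# Hide any grid cells that did not receive a plot.
for ax in flat_axes[len(y_cols):]:
    ax.set_visible(False)

fig.tight_layout()  # spaces the panels so the labels do not overlap
```

Working with explicit axes objects avoids keeping a manual subplot counter, and `tight_layout` takes the place of hand-tuned `subplots_adjust` margins.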

Here we can look at the age distribution of all soccer players in this dataset.

plt.figure(figsize=(8, 6))
# Count the players at each age once and reuse the result for both axes.
age_counts = df1.age.value_counts()
sns.barplot(x=age_counts.index, y=age_counts.values, palette="plasma")

plt.xticks(fontsize=15, rotation=90)
plt.yticks(fontsize=15)
plt.title('Age Distribution')
plt.show()