Commit 26037cc2 authored by niklasfranz
Merge branch 'master' into 19-logging

parents 95d24639 05a2363d
1 merge request: !10 "Merge #19-logging into into master"
%% Cell type:markdown id: tags:
# Notebook for Coders
%% Cell type:code id: tags:
```
```
%% Cell type:markdown id: tags:
%% Cell type:markdown id: tags:
# Anxiety in Computer Gamers: differences, similarities and learnings
# Overview
In this project we analyze anxiety in gamers. We picked the dataset from Kaggle because it intersects with our personal interests. The data and survey can be found [here](https://www.kaggle.com/datasets/divyansh22/online-gaming-anxiety-data).
The data was acquired through a survey published and shared online, so anyone could participate. For us that also means taking into account that the distribution of participants and their answers can be skewed.
## Motivation
%% Cell type:code id: tags:
``` python
import matplotlib.pyplot as plt
from src.Dataset import Dataset
from src.Plotter import Plotter
# Forward slash keeps the path portable and avoids the invalid "\G" escape
dataset = Dataset("data/GamingStudy_data.csv")
dataframe = dataset.get_dataframe()
print(dataset)
plotter = Plotter(dataset)
```
%% Output
<src.Dataset.Dataset object at 0x000001CD53BDA250>
%% Cell type:markdown id: tags:
# Data Exploration
Because the data was collected in a semi-professional way for a pre-study, we had to clean it up and make some changes.
Some questions could be answered in an open text field. Naturally, the answers in those columns are very diverse and hard to analyze.
#### Affected Columns
+ Whyplay
+ Earnings
+ League
In the following we explain if and how we used these columns.
To do: deleted columns, a general overview of the distribution (men and women, games, platform) and the problems with it.
%% Cell type:markdown id: tags:
## Distribution of Participants
### Gender
%% Cell type:code id: tags:
``` python
# Planned: 4 plots showing the distribution of Gender,
# Platform (where they found the survey), the top 5 Games, and Console
plotter.distribution_plot("Gender")
pass
```
%% Cell type:markdown id: tags:
## Explanation of technical terms
### SPIN
SPIN stands for Social Phobia Inventory.
The SPIN is a standardized set of 17 questions. After answering the questionnaire, a SPIN value is calculated, which is effective for screening for and measuring the severity of social anxiety disorder.
1. I am afraid of people in authority.
2. I am bothered by blushing in front of people.
3. Parties and social events scare me.
4. I avoid talking to people I don’t know.
5. Being criticized scares me a lot.
6. I avoid doing things or speaking to people for fear of embarrassment.
7. Sweating in front of people causes me distress.
8. I avoid going to parties.
9. I avoid activities in which I am the center of attention.
10. Talking to strangers scares me.
11. I avoid having to give speeches.
12. I would do anything to avoid being criticized.
13. Heart palpitations bother me when I am around people.
14. I am afraid of doing things when people might be watching.
15. Being embarrassed or looking stupid are among my worst fears.
16. I avoid speaking to anyone in authority.
17. Trembling or shaking in front of others is distressing to me.
### GAD
GAD stands for Generalized Anxiety Disorder. It is a mental and behavioral disorder, specifically an anxiety disorder characterized by excessive, uncontrollable and often irrational worry about events or activities. Specific questionnaires can be used to evaluate the disorder. In the questionnaire used here, the minimum score is 0 and the maximum is 21.
#### Common worries
- Health
- Finances
- Death
- Family
- Relationships
- Work
#### Symptoms
- Excessive worry
- Restlessness
- Low concentration
- Trouble sleeping
- Exhaustion / fatigability
- Irritability
- Sweating
- Trembling (muscle contraction)
The questions in the questionnaire target these symptoms and worries and summarize them into a score between 0 and 21.
### SWL
#### Explanation
SWL stands for Satisfaction with Life. The survey has 5 questions. You fill it in yourself (it is not administered by a psychiatrist).
For each question, you choose an integer from 1 (strongly disagree) to 7 (strongly agree).
In general, lower numbers mean you are less satisfied with that aspect of life.
This means the total score ranges from 5 (least satisfied) to 35 (most satisfied).
#### Interpretation
The (total) SWL score can be interpreted as:
- 31 - 35 Extremely satisfied
- 26 - 30 Satisfied
- 21 - 25 Slightly satisfied
- 20 Neutral
- 15 - 19 Slightly dissatisfied
- 10 - 14 Dissatisfied
- 5 - 9 Extremely dissatisfied
A more detailed interpretation can be found [here](http://labs.psychology.illinois.edu/~ediener/Documents/Understanding%20SWLS%20Scores.pdf).
Residents of developed nations (e.g. Germany) usually score 20-24.
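The scoring and interpretation described above can be sketched in a few lines. This is a minimal illustration only: the function names `swl_total` and `interpret_swl` are hypothetical and not part of the project's `Dataset` class.

```python
# Hypothetical helpers illustrating the SWL scoring rules described above.
SWL_BANDS = [
    (31, "Extremely satisfied"),
    (26, "Satisfied"),
    (21, "Slightly satisfied"),
    (20, "Neutral"),
    (15, "Slightly dissatisfied"),
    (10, "Dissatisfied"),
    (5, "Extremely dissatisfied"),
]


def swl_total(answers):
    """Sum the five answers, each an integer from 1 to 7."""
    if len(answers) != 5:
        raise ValueError("the SWL survey has exactly 5 questions")
    if any(not isinstance(a, int) or not 1 <= a <= 7 for a in answers):
        raise ValueError("each answer must be an integer from 1 to 7")
    return sum(answers)


def interpret_swl(total):
    """Map a total score (5 to 35) to its interpretation band."""
    if not 5 <= total <= 35:
        raise ValueError("the total SWL score must be between 5 and 35")
    # Bands are ordered from highest to lowest lower bound.
    for lower_bound, label in SWL_BANDS:
        if total >= lower_bound:
            return label


print(interpret_swl(swl_total([4, 4, 4, 4, 4])))  # prints "Neutral"
```

Keeping the bands in one ordered list makes the thresholds easy to compare against the table above.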
#### Questions
____ In most ways my life is close to my ideal.<br>
____ The conditions of my life are excellent.<br>
____ I am satisfied with my life.<br>
____ So far I have gotten the important things I want in life.<br>
____ If I could live my life over, I would change almost nothing.<br>
---
%% Cell type:markdown id: tags:
# Analysis
## Preprocessing
*To do: explain the new columns and why we created them.*
Some columns allowed participants to write individual responses. Such free-text answers are hard to use directly in data analysis. In some cases we cleaned the columns and changed the unusual answers to "Other"/"NA".
### Cleaned Columns
+ "Whyplay"
+ Accept
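As an illustration of this kind of cleaning, the sketch below maps every answer that is not one of the predefined options to a fallback label. It is a minimal example under stated assumptions: the helper `clean_free_text`, the set `KNOWN_REASONS` and the toy answers are hypothetical, and the project's actual cleaning code may work differently.

```python
import pandas as pd

# Assumed predefined options; the real survey options may differ.
KNOWN_REASONS = {"having fun", "improving", "winning", "relaxing", "all of the above"}


def clean_free_text(column, known, fallback="other"):
    """Normalize case and whitespace, then replace unknown answers with `fallback`."""
    normalized = column.fillna("").astype(str).str.strip().str.lower()
    # Keep values that match a known option, replace everything else.
    return normalized.where(normalized.isin(list(known)), fallback)


# Toy answers, not taken from the dataset.
raw = pd.Series(["Having fun", " winning ", "to hang out with friends", None])
cleaned = clean_free_text(raw, KNOWN_REASONS)
print(cleaned.tolist())
```

Missing answers end up in the fallback category here; mapping them to a separate "NA" label instead would be an equally reasonable choice.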
## Normalizing the Data
### Creating ["Is_narcissist"]
### Creating ["Anxiety_score"]
%% Cell type:code id: tags:
``` python
# Executing and showing new columns
dataset.get_combined_anxiety_score(dataset.get_dataframe())
```
%% Cell type:markdown id: tags:
### Creating ["Is_competetive"]
%% Cell type:markdown id: tags:
## Q1 - Which gamers are more anxiety prone?
Text .......
We compare
### Women vs Men
Explanation
![Example Plot](https://cdn.discordapp.com/attachments/806128836332879924/1127988009627832320/Unbenannt.png)
%% Cell type:code id: tags:
``` python
"""# SIDE BY SIDE PLOTS
# LEFT = LINE Graph distribution of Anxiety Score Related to Group
# RIGHT = Stacked Bars comparing the GROUP with =
# 1.[Work] - 4 Bars
# 2.[Degree] - 5 Bars
# 3.[Whyplay ] - 4 Bars (Everything until "All of them")
"""
#
```
%% Cell type:markdown id: tags:
### Competitive vs Easy Going Players
Explanation
%% Cell type:code id: tags:
``` python
"""# SIDE BY SIDE PLOTS
# LEFT = LINE Graph distribution of Anxiety Score Related to Group
# RIGHT = Stacked Bars comparing the GROUP with =
# 1.[Work] - 4 Bars
# 2.[Degree] - 5 Bars
# 3.[Whyplay ] - 4 Bars (Everything until "All of them")"""
```
%% Cell type:markdown id: tags:
### Narcissist vs Non-Narcissist
%% Cell type:code id: tags:
``` python
"""# SIDE BY SIDE PLOTS
# LEFT = LINE Graph distribution of Anxiety Score Related to Group
# RIGHT = Stacked Bars comparing the GROUP with =
# 1.[Work] - 4 Bars
# 2.[Degree] - 5 Bars
# 3.[Whyplay ] - 4 Bars (Everything until "All of them")"""
```
## Q2 - Correlations between played hours and one's well-being
**Maybe we can even add whether hours spent watching streams affect it**
For research question two, we wanted to know whether there is a correlation
between hours played and the player's well-being. We went into the question
expecting that players who play longer hours are more anxiety prone
and less satisfied with life than those who play less. If that is the
case, we would expect a positive correlation between hours played and our
combined anxiety score. We look at the data with a scatterplot of both
variables of interest, using the plot_scatterplot() function of our Plotter
class, called as plotter.plot_scatterplot("Hours", "Anxiety_score").
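A scatterplot shows the shape of the relationship; a correlation coefficient summarises its strength in a single number. The following is a minimal sketch on made-up toy values, assuming the columns are named `Hours` and `Anxiety_score` as in the call above. It is not the project's actual analysis, and the numbers say nothing about the real data.

```python
import pandas as pd

# Toy values for illustration only, not taken from the survey.
toy = pd.DataFrame(
    {
        "Hours": [5, 10, 20, 40],
        "Anxiety_score": [3, 4, 4, 6],
    }
)

# Pearson correlation: measures the linear relationship, from -1 to 1.
pearson_r = toy["Hours"].corr(toy["Anxiety_score"])

# Rank-based (Spearman-style) correlation: Pearson on the ranks,
# which is less sensitive to a few players with extreme hours.
rank_r = toy["Hours"].rank().corr(toy["Anxiety_score"].rank())

print(f"Pearson r: {pearson_r:.2f}")
print(f"Rank-based r: {rank_r:.2f}")
```

A value near 0 would contradict the expectation stated above, while a clearly positive value would support it. Either way, a correlation alone does not show that playing longer causes anxiety.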
%% Cell type:code id: tags:
``` python
plotter.plot_scatterplot("Hours", "Anxiety_score")
# Still needs to be prettier
```
%% Cell type:markdown id: tags:
## Q3 - Effect of the reason for playing on the satisfaction with life
%% Cell type:markdown id: tags:
In this question, we visualise and discuss how a player's reason for playing and their satisfaction with life score (SWL) affect each other.
Although a description of the categories is not given, we briefly describe them as follows:
* "improving": players are competitive and derive satisfaction from outperforming themselves and others.
* "winning": players are more competitive than those who wish to improve, and derive immense satisfaction from outperforming. Players who play to win experience games more intensely than those in other categories.
* "having fun": players are not competitive. They are not particularly invested in improving or in the outcome of the game, but instead play as a form of recreation. This does not imply that the game is easy or low in intensity; a challenging game can still be fun as long as players derive satisfaction not from the outcome, but from the gameplay or environs (friends, etc.).
* "relaxing": players are playing to relax, and may play games to reduce their anxiety.
* "all of the above": players in this category are generally competitive but also see the importance of enjoying the game itself.
%% Cell type:code id: tags:
``` python
# Planned: horizontal bar chart with one row for every reason for playing,
# colored by the amount of anxiety in that group. For now we show a boxplot.
print("Category distribution:")
print(dataframe.groupby("whyplay").size().sort_values(ascending=False))
fig, ax = plt.subplots(figsize=(8,6))
# Intended category order (not applied to the plot yet)
order = ["relaxing", "having fun", "other", "improving", "winning"]
dataframe[dataframe["whyplay"] != "other"].boxplot(column=["SWL_T"], by="whyplay", ax=ax)
# Clear the figure title after plotting, because boxplot(by=...) sets its own
fig.suptitle("")
pass
```
%% Output
Category distribution:
whyplay
having fun 5105
improving 4661
winning 1977
relaxing 623
other 424
all of the above 48
dtype: int64
%% Cell type:markdown id: tags:
As seen in this plot, we can observe the following:
* On average, those who play to have fun are more satisfied with life than any other group.
  * We find this outcome reasonable. Those who are more satisfied with life generally do not rely so much on gaming as a means of fulfillment.
* As expected, those who play to win are the least satisfied with their lives, as they disproportionately value being the best over enjoying the game.
* Interestingly, those who play to relax are also less satisfied with their lives on average. This may be because players in this category are not satisfied with life and use gaming as a means to destress.
* Those who selected "all of the above" have a much smaller range of SWL scores. This is likely due to the small sample size (48 participants).
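The observations above are read off the boxplot. A per-group summary table makes them easier to check and shows directly how few answers some groups contain. Below is a minimal sketch on made-up toy values, using the column names `whyplay` and `SWL_T` from the cell above; the numbers are illustrative and not taken from the survey.

```python
import pandas as pd

# Toy values for illustration only.
toy = pd.DataFrame(
    {
        "whyplay": [
            "having fun", "having fun", "having fun",
            "winning", "winning",
            "all of the above",
        ],
        "SWL_T": [25, 22, 28, 15, 19, 20],
    }
)

# Group size, mean and median SWL score per reason for playing.
summary = (
    toy.groupby("whyplay")["SWL_T"]
    .agg(["count", "mean", "median"])
    .sort_values("count", ascending=False)
)
print(summary)
```

Reporting `count` next to the averages helps flag groups, such as "all of the above", whose narrow range may simply reflect a small sample.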
%% Cell type:markdown id: tags:
### Effects of income level (`work`) and education level (`Degree`) on the reason to play
%% Cell type:code id: tags:
``` python
"""#Overlaying Histogram
# Histogram for the income level Y = %, X = low to high
# One in Green for the income
# One in Red for the Anxiety for those people """
```
%% Cell type:markdown id: tags:
## Q4 - Gamers from different countries
%% Cell type:markdown id: tags:
1. Do they play different games?
   1. Are they reacting differently to those games?
2. Is the share of educated players similar?
%% Cell type:markdown id: tags:
![Scatter](https://cdn.discordapp.com/attachments/1127973734884581386/1127973829344493679/image.png)
%% Cell type:code id: tags:
``` python
"""#### Analyze the countries amounting to Top 7 or 90% of the survey.
#Q4.MAP PLOT = Most played game per country (Don't do it if it's League everywhere.)
#Q4 MAP PLOT = Heat Map with redder areas for more Anxiety in the country.
#Q1.2 Grouped Bar Chart with the top game next to the "Anxiety Score"
#2 Scatter Plot like in the example """
```
......
...@@ -15,7 +15,7 @@ class Plotter: ...@@ -15,7 +15,7 @@ class Plotter:
self.df = dataset.get_dataframe() self.df = dataset.get_dataframe()
def customize_plot(self, fig, ax, styling_params) -> None: def customize_plot(self, fig, ax, styling_params) -> None:
""" customize_plot """customize_plot
Args: Args:
fig (plt.figure.Figure), fig (plt.figure.Figure),
...@@ -72,7 +72,7 @@ class Plotter: ...@@ -72,7 +72,7 @@ class Plotter:
def plot_categorical_bar_chart( def plot_categorical_bar_chart(
self, category1, category2, styling_params={} self, category1, category2, styling_params={}
) -> None: ) -> None:
""" plot a categorical bar chart. """plot a categorical bar chart.
Args: Args:
category1 (str, must be present as a column in the dataset), category1 (str, must be present as a column in the dataset),
...@@ -119,7 +119,7 @@ class Plotter: ...@@ -119,7 +119,7 @@ class Plotter:
def plot_categorical_boxplot( def plot_categorical_boxplot(
self, target, category, styling_params={} self, target, category, styling_params={}
) -> None: ) -> None:
""" plot a categorical boxplot. """plot a categorical boxplot.
Args: Args:
target (str, must be present as a column in the dataset), target (str, must be present as a column in the dataset),
...@@ -130,32 +130,7 @@ class Plotter: ...@@ -130,32 +130,7 @@ class Plotter:
Returns: Returns:
None None
""" """
# implementing sensible logging and error catching
if (type(target) != str):
logging.error("parameter target should be a string.")
raise ValueError("parameter target should be a string.")
if not (target in self.df.columns):
logging.error("parameter target cannot be found in the dataset.")
raise ValueError(
"parameter target cannot be found in the dataset."
)
if (type(category) != str):
logging.error("parameter category should be a string.")
raise ValueError("parameter category should be a string.")
if not (category in self.df.columns):
logging.error("parameter category cannot be found in the dataset.")
raise ValueError(
"parameter category cannot be found in the dataset."
)
if (type(styling_params) != dict):
logging.error("parameter styling params should be a dict.")
raise ValueError("parameter styling params should be a dict.")
# plotting the plot
fig, ax = plt.subplots() fig, ax = plt.subplots()
self.customize_plot(fig, ax, styling_params) self.customize_plot(fig, ax, styling_params)
sns.boxplot(x=category, y=target, data=self.df, palette="rainbow") sns.boxplot(x=category, y=target, data=self.df, palette="rainbow")
...@@ -163,7 +138,7 @@ class Plotter: ...@@ -163,7 +138,7 @@ class Plotter:
def plot_categorical_histplot( def plot_categorical_histplot(
self, target, category, styling_params={}, bins=30 self, target, category, styling_params={}, bins=30
) -> None: ) -> None:
""" plot a categorical hisplot. """plot a categorical hisplot.
Args: Args:
target (str, must be present as a column in the dataset), target (str, must be present as a column in the dataset),
...@@ -215,7 +190,7 @@ class Plotter: ...@@ -215,7 +190,7 @@ class Plotter:
) )
def plot_scatterplot(self, target1, target2, styling_params={}) -> None: def plot_scatterplot(self, target1, target2, styling_params={}) -> None:
""" plot a scatterplot. """plot a scatterplot.
Args: Args:
target1 (str, must be present as a column in the dataset), target1 (str, must be present as a column in the dataset),
...@@ -255,4 +230,23 @@ class Plotter: ...@@ -255,4 +230,23 @@ class Plotter:
# plotting the plot # plotting the plot
fig, ax = plt.subplots() fig, ax = plt.subplots()
self.customize_plot(fig, ax, styling_params) self.customize_plot(fig, ax, styling_params)
ax.scatter(self.df[target1], self.df[target2]) ax.scatter(self.df[target1], self.df[target2])
\ No newline at end of file
    def distribution_plot(self, target: str) -> None:
        """Plot the number of rows per value of a column as horizontal bars.

        Args:
            target (str): name of the column to group by; must be present
                as a column in the dataset.

        Returns:
            None
        """
        # Count how many rows fall into each value of the target column
        grouped_data = self.df.groupby(target).size()
        plt.barh(grouped_data.index, grouped_data.values)
        print(grouped_data.sort_values(ascending=False))
        plt.xlabel("Size")
        plt.ylabel(target)
        plt.title(f"Distribution of {target}")