Select any dataset that contains more than 300 observations with at least 10 attributes from https://archive.ics.uci.edu or https://www.kaggle.com or any other online free data repository. Perform detailed analyses on the selected data by using ONE (1) data reduction method and ONE (1) clustering method of your choice. Explain your choices and discuss your results.

NOTES:

• The link and the description of the selected dataset should be provided, and the dataset should NOT have been used in the lectures or labs of the course.

• Describe the dataset: the number of instances (rows), the number of features/attributes (columns), the area/domain/field, and any missing values.

• Any preprocessing method (e.g. removal or filling of empty cells) performed on the original data needs to be fully described and shown.

• Your analyses shall include descriptions of your Python code and plots.
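As one possible starting point for the workflow described above, the sketch below pairs principal component analysis (a data reduction method) with k-means (a clustering method). It is only an illustration under stated assumptions: the data are synthetic stand-ins generated with NumPy rather than a real repository dataset, the function names `pca_reduce` and `kmeans` are hypothetical, and the choice of two components and two clusters is arbitrary.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardize the columns, then project onto the leading principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = Xs @ Vt[:n_components].T
    explained = (S ** 2) / np.sum(S ** 2)      # share of variance per component
    return scores, explained[:n_components]

def assign(points, centers):
    """Index of the nearest center (Euclidean distance) for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with randomly chosen initial centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(points, centers), centers

# Synthetic placeholder data (assumption): 320 observations, 10 attributes, two groups.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(160, 10)),
    rng.normal(5.0, 1.0, size=(160, 10)),
])

scores, explained = pca_reduce(X, n_components=2)
labels, centers = kmeans(scores, k=2)
print("Variance explained by the first two components:", explained)
print("Cluster sizes:", np.bincount(labels, minlength=2))
```

In an actual submission the synthetic array would be replaced by the preprocessed dataset, and the number of components and clusters would need to be justified, for example with a scree plot and an elbow or silhouette analysis.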

multiple regression

Review “Multiple Regression Models Case Study: Web Video on Demand” (ATTACHED) for this topic’s case study, predicting advertising sales for an Internet video-on-demand streaming service.

After developing Regression Model A and Regression Model B, prepare a 250-500-word executive summary of your findings. Explain your approach and evaluate the outcomes of your regression models.

Submit a copy of the Excel spreadsheet file you used to design your regression model and to determine statistical significance.


Note: Use Excel’s regression option to perform the regression.


Use an Excel spreadsheet file for the calculations and explanations. Cells should contain the formulas (i.e., if a formula was used to calculate the entry in that cell).


Use the “Multiple Regression Dataset” Excel resource to complete this assignment (ATTACHED).

Prepare the written portion of this assignment according to the guidelines found in the APA Style Guide.


This assignment uses a rubric (ATTACHED). Please review the rubric prior to beginning the assignment to become familiar with the expectations for successful completion.

ATTACHMENTS

multiple regression
Multiple Regression Models Case Study: Web Video on Demand

Web Video on Demand (WVOD) is an Internet video-on-demand streaming service. The company offers a subscription service for $5.99/month, which includes access to all programming and 30-second commercial intervals. In the last year, the company has recently begun producing its own programming, including 30-, 60-, and 120-minute television shows, specials, and films. Programming has been developed for teen audiences as well as adults. The following data represent the amount of money brought in through advertising sales, the average number of viewers, length of the program, and the average viewer age per program.

Advertising Sales ($)   Average # of Viewers (Millions)   Length of Program (Minutes)   Average Viewer Age (Years)
28,000                  10.1                              30                            30
25,500                  11.4                              30                            25
31,000                  19.9                              60                            30
29,000                  13.6                              60                            38
20,500                  12.5                              60                            20
14,500                  3.5                               30                            15
27,000                  15.1                              60                            24
23,500                  3.7                               30                            17
19,500                  4.3                               30                            19
23,000                  12.2                              120                           45
18,000                  5.1                               120                           19
29,500                  15.9                              60                            28
30,000                  16.8                              120                           31
25,000                  8.5                               120                           58
22,500                  9.1                               30                            43

The WVOD executives are in the process of evaluating a partnership with several independent filmmakers to fund and distribute socially conscious and diverse programming. The executives have asked for regression models to be developed based on specific needs. The three regression model requests and programming details are included below.

The WVOD executives would like to see a regression model that predicts the amount of advertising sales based on the number of viewers and the length of the program. Develop this regression model (“Regression Model A”). Web Video on Demand would like to acquire a 60-minute documentary special about social media and bullying. The special is aimed at teen viewers and is estimated to bring in 3.2 million viewers. Based on the regression model, predict the advertising sales that could be generated by the special.
The WVOD executives would also like to see a regression model that predicts the amount of advertising sales based on the number of viewers, the length of the program, and the average viewer age. Develop this regression model (“Regression Model B”). Web Video on Demand may acquire a 2-hour film that was a hit with critics and audiences at several international film festivals. Initial customer surveys indicate that the film could bring in 14.1 million viewers and the average viewer age would be 32. Use this information to predict the advertising sales.

© 8/14/19. Grand Canyon University. All Rights Reserved.
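The assignment asks for Excel's regression tool, so the following is only a hedged cross-check in Python, not a replacement for the spreadsheet. It fits both models by ordinary least squares on the 15 rows from the case study and evaluates the two requested predictions. The helper names `fit_ols`, `predict`, and `r_squared` are illustrative, and the sketch reports coefficients and R-squared only; significance tests (t statistics, p-values) would still come from the Excel output.

```python
# WVOD case-study rows: (sales $, viewers in millions, length in minutes, average viewer age)
DATA = [
    (28000, 10.1, 30, 30), (25500, 11.4, 30, 25), (31000, 19.9, 60, 30),
    (29000, 13.6, 60, 38), (20500, 12.5, 60, 20), (14500, 3.5, 30, 15),
    (27000, 15.1, 60, 24), (23500, 3.7, 30, 17), (19500, 4.3, 30, 19),
    (23000, 12.2, 120, 45), (18000, 5.1, 120, 19), (29500, 15.9, 60, 28),
    (30000, 16.8, 120, 31), (25000, 8.5, 120, 58), (22500, 9.1, 30, 43),
]

def fit_ols(rows, targets):
    """Solve the normal equations (X'X) b = X'y; returns [intercept, slopes...]."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, targets))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(coef, row):
    """Predicted sales for one row of predictor values."""
    return coef[0] + sum(b * x for b, x in zip(coef[1:], row))

def r_squared(coef, rows, targets):
    """Share of the variance in the targets explained by the fitted model."""
    mean_y = sum(targets) / len(targets)
    sse = sum((y - predict(coef, x)) ** 2 for x, y in zip(rows, targets))
    sst = sum((y - mean_y) ** 2 for y in targets)
    return 1.0 - sse / sst

sales = [d[0] for d in DATA]
rows_a = [(d[1], d[2]) for d in DATA]            # Model A: viewers, length
rows_b = [(d[1], d[2], d[3]) for d in DATA]      # Model B: viewers, length, age

coef_a = fit_ols(rows_a, sales)
coef_b = fit_ols(rows_b, sales)
r2_a = r_squared(coef_a, rows_a, sales)
r2_b = r_squared(coef_b, rows_b, sales)
pred_a = predict(coef_a, (3.2, 60))              # 60-minute special, 3.2 million viewers
pred_b = predict(coef_b, (14.1, 120, 32))        # 2-hour film, 14.1 million viewers, age 32

print("Model A coefficients:", coef_a, "R^2:", r2_a, "prediction:", pred_a)
print("Model B coefficients:", coef_b, "R^2:", r2_b, "prediction:", pred_b)
```

The coefficients printed here should agree with Excel's Regression output up to rounding, which makes the script a convenient sanity check on the spreadsheet formulas.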

Descriptive Stats

For your paper, you will run an analysis on data you have gathered or acquired. There are two sections for this paper. In the first section you will present your results (what do the numbers say). You will need tables or figures to show your results (place these at the end of your paper, give each table or figure its own page, and label them in numeric order, but indicate in your paper where they belong; see the samples for how to do this). You must include as your first table a table of descriptive statistics (mean, SD, minimum, maximum) for all your variables. Failure to do so will result in an automatic 5-point penalty. You will also want to discuss which variables are significant (also mention which ones fail to reach significance), the direction of the significant variables, and the substantive impact of your variables (if your method allows you to do so).
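A table of descriptive statistics like the one required above can be produced with Python's standard library. The sketch below is only illustrative: the variable names and values are hypothetical placeholders, and the sample standard deviation (n - 1 denominator) is assumed.

```python
import statistics

# Hypothetical placeholder variables; substitute the variables from your own data.
variables = {
    "age": [23, 35, 41, 29, 52],
    "hours_worked": [40.0, 37.5, 45.0, 20.0, 38.0],
}

def describe(values):
    """Mean, sample SD, minimum, and maximum for one variable."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),   # sample SD (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }

table = {name: describe(values) for name, values in variables.items()}

# Print one row per variable in a fixed-width layout.
print(f"{'Variable':<14}{'Mean':>10}{'SD':>10}{'Min':>10}{'Max':>10}")
for name, row in table.items():
    print(f"{name:<14}{row['mean']:>10.2f}{row['sd']:>10.2f}"
          f"{row['min']:>10.2f}{row['max']:>10.2f}")
```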

The second section is your conclusion. Summarize what you have found, why your findings matter, and the implications of your findings. Implications can range from empirical to normative. You may also discuss new questions your findings raise (new answers only lead to new questions).

Statistics study analysis

I. Use the Sleep Patterns and Energy Drinks study below, and answer the following questions:

A. What sampling technique was used for the study?

B. What was the level of measurement for each type of data collected in the study?

C. What descriptive statistics were used in the study? Name them all.

D.  What hypothesis was tested in the study?

E.  What type of test was run on the data?

F.  What was the significance level for the hypothesis test?

G.  What conclusions did the researcher draw?

H.  Were the conclusions appropriate for the study?

I. What were the limitations of the study?

Sleep Patterns and Energy Drinks

A researcher decides to study the relationship between the consumption of energy drinks and the amount of sleep that college students report. He surveys students at a local college, using a random numbering scheme to select 100 students to send the survey to. Initially his survey included questions about class year (freshman through senior), age, marital status, class load, average hours of sleep during the week and on the weekend, and number of energy drinks consumed on weekdays and on weekends. When he refined his questions, he decided to include questions about consumption of other forms of caffeine as well (coffee, teas, and soda). He believes that students who consume energy drinks are getting less sleep than those who do not, but he decides that he needs the other caffeine data as well and decides to split his sample into three groups once he collects the data: no caffeine use, caffeine use but no energy drinks, and energy drink users (whether or not they use caffeine in other forms).

He gets 72 surveys returned and divides them up into groups according to caffeine use as indicated above with the following results:

No caffeine group n = 15, sleep per night mean = 7.23 hours, sd = 1.72 hours

Caffeine, no energy drink group n = 27, sleep per night mean = 7.17 hours, sd = 1.58 hours

Energy drink group n = 30, sleep per night mean = 6.42 hours, sd = 1.87 hours

He runs a statistical comparison between the groups and finds F = 7.923, p = .032.

He concludes that there is a significant difference between the groups at the .05 level and decides to do further tests to see where the difference lies.
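When answering the questions about the test that was run, it can help to see how a one-way ANOVA F statistic is assembled from group summary statistics. The sketch below assumes a one-way ANOVA is the intended comparison and uses only the group sizes, means, and standard deviations listed in the scenario. The function name `anova_from_summary` is illustrative, and no p-value is computed, since that would require an F distribution routine.

```python
# (group label, n, mean hours of sleep, standard deviation) from the scenario
groups = [
    ("No caffeine", 15, 7.23, 1.72),
    ("Caffeine, no energy drinks", 27, 7.17, 1.58),
    ("Energy drinks", 30, 6.42, 1.87),
]

def anova_from_summary(groups):
    """One-way ANOVA F statistic computed from per-group n, mean, and SD."""
    total_n = sum(n for _, n, _, _ in groups)
    grand_mean = sum(n * mean for _, n, mean, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(n * (mean - grand_mean) ** 2 for _, n, mean, _ in groups)
    # Within-group sum of squares: pooled from the group variances
    ss_within = sum((n - 1) * sd ** 2 for _, n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = anova_from_summary(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

The value printed can be compared with the F reported in the scenario when evaluating whether the researcher's conclusions are appropriate.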

Response to Another Student Rotation Discussion

DIRECTIONS: READ THE ATTACHED DISCUSSION BELOW BY ANOTHER STUDENT AND:


Respond by expanding on one of the points made. Cite references.

Rotation - Jessica Coutain

Unit 3 Discussion 2

Factor rotation is a technique used to transform factors gained from factor analysis (FA) so that the factor loadings that are small would be minimized, and factor loadings that are large would be maximized, in order to enhance the interpretability of these factors (Field, 2013; Warner, 2013, p. 848). The major difference between orthogonal and oblique rotation is that the orthogonal rotation preserves the orthogonality of the factors (i.e., the correlations between them remain equal to zero), whereas the oblique rotation allows the new factors to be correlated.

Some advantages of orthogonal rotations are that they provide results that are simpler due to the preserved orthogonality of factors, and might be easier to interpret. Orthogonal rotations also produce lower sampling errors, and the results from these rotations are more likely to be replicated in further studies (Kieffer, 1998, p. 12). An important limitation is that “underlying factors” are rarely completely uncorrelated if they correspond to something in reality, so orthogonal rotations tend to oversimplify the model (Field, 2013; Kieffer, 1998).

On the contrary, oblique rotations allow for the best fit of the model to the gathered data, and they may better correspond to the scholar’s view about the world (Kieffer, 1998; Warner, 2013). However, a limitation is that these results are more difficult to interpret, for oblique rotations produce more data to be assessed (Kieffer, 1998, p. 16).

A researcher might want to use oblique rotation if they believe that the orthogonal rotation oversimplifies the data, because “hidden factors,” as was noted, will rarely be uncorrelated if they are to reflect something in reality (Field, 2013, sec. 17.4.6). Also, a substantial reason might be that the positioning of clusters of the original variables is such that an orthogonal rotation will be unsuccessful in maximizing the factor loadings, whereas an oblique rotation will be much more effective (Field, 2013, sec. 17.4.6).

References

Field, A. (2013). Discovering statistics using IBM SPSS Statistics (4th ed.). Thousand Oaks, CA: SAGE Publications.

Kieffer, K. M. (1998). Orthogonal versus oblique factor rotation: A review of the literature regarding the pros and cons. Retrieved from http://files.eric.ed.gov/fulltext/ED427031.pdf

Warner, R. M. (2013). Applied statistics: From bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: SAGE Publications.

Must have knowledge of numerical methods (Simpson’s 1/3 integral) and be able to implement them as program code in MATLAB

Question w in the attached document

This question uses the following data obtained during a cell culture process.

A. Write your own MATLAB function that computes the Simpson’s 1/3 integral between any two tabulated time points.

B. Use this function to compute the total amount of oxygen absorbed between hours 141 and 146.

C. Compare the result you obtained for part B with results you find using the MATLAB commands ‘integral’ and ‘trapz’. This will require you to read the documentation associated with each function.
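The question calls for MATLAB, but the logic of the composite Simpson's 1/3 rule on tabulated data can be sketched in Python and then ported. The cell culture data from the attached document are not reproduced here, so the sample values below are hypothetical (y = t squared on an hourly grid), and the function names `simpson_13` and `trapezoid` are illustrative. The trapezoid function plays the role that ‘trapz’ would play in the comparison.

```python
def simpson_13(t, y, i_start, i_end):
    """Composite Simpson's 1/3 rule between tabulated points t[i_start] and t[i_end].

    Requires an even number of equally spaced intervals between the two indices.
    """
    n = i_end - i_start                       # number of intervals
    if n <= 0 or n % 2 != 0:
        raise ValueError("Simpson's 1/3 rule needs an even, positive number of intervals")
    h = (t[i_end] - t[i_start]) / n
    for k in range(i_start, i_end):           # verify equal spacing
        if abs((t[k + 1] - t[k]) - h) > 1e-9 * max(1.0, abs(h)):
            raise ValueError("time points must be equally spaced")
    seg = y[i_start:i_end + 1]
    odd_sum = sum(seg[1:-1:2])                # interior points weighted by 4
    even_sum = sum(seg[2:-1:2])               # interior points weighted by 2
    return h / 3.0 * (seg[0] + 4.0 * odd_sum + 2.0 * even_sum + seg[-1])

def trapezoid(t, y, i_start, i_end):
    """Composite trapezoidal rule over the same tabulated range."""
    return sum((t[k + 1] - t[k]) * (y[k] + y[k + 1]) / 2.0
               for k in range(i_start, i_end))

# Hypothetical tabulated data (assumption): y = t^2 sampled at t = 0, 1, 2, 3, 4
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [ti ** 2 for ti in t]

simpson_value = simpson_13(t, y, 0, 4)
trapz_value = trapezoid(t, y, 0, 4)
print("Simpson 1/3:", simpson_value)
print("Trapezoid:  ", trapz_value)
```

Because the 1/3 rule needs an even number of equal-width intervals, whether it can be applied directly between hours 141 and 146 depends on the sampling interval in the actual data; if the interval count turns out to be odd, a common workaround is to combine it with another rule on one end segment.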

Develop a matrix for the following situation and calculate accuracy, true negative rate, and true positive rate. The illustrations from this unit’s studies are linked in Resources.

Comment on the utility of the test—does it work well? Do you think police departments would find this test acceptable? What about defense attorneys? Interpret accuracy, true negative rate, and true positive rate for this scenario:

Police departments need to test drivers for blood alcohol levels when they suspect the driver may be under the influence of alcohol. Blood alcohol tests are expensive and are invasive, requiring a blood sample. Breathalyzer tests provide preliminary test results that can establish probable cause for police officers to arrest an individual and take them for a formal blood alcohol test. Bringing a person in for a formal test when they really are not drunk creates ill will for a police department. Not detecting a truly drunk driver could lead to a tragic accident. Your firm has developed a new breathalyzer test and has the following results:

One hundred subjects were tested with your new device. Sixty subjects were truly drunk and 40 were not drunk. Of the 60 drunk subjects, the test correctly identified 58 as being drunk and incorrectly identified two as not drunk when they in fact were drunk. Of the 40 not drunk subjects, 22 were correctly identified as not drunk and 18 were incorrectly identified as drunk.
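The counts in the scenario above map directly onto a 2x2 confusion matrix, treating "drunk" as the positive class. The short sketch below, with illustrative variable names, lays out that matrix and computes the three requested measures from it.

```python
# Counts from the scenario, with "drunk" as the positive class
true_positive = 58    # drunk, test says drunk
false_negative = 2    # drunk, test says not drunk
true_negative = 22    # not drunk, test says not drunk
false_positive = 18   # not drunk, test says drunk

total = true_positive + false_negative + true_negative + false_positive

accuracy = (true_positive + true_negative) / total
true_positive_rate = true_positive / (true_positive + false_negative)   # sensitivity
true_negative_rate = true_negative / (true_negative + false_positive)   # specificity

# Confusion matrix: rows are the true condition, columns are the test result
print(f"{'':<18}{'Test: drunk':>14}{'Test: not drunk':>18}")
print(f"{'Truly drunk':<18}{true_positive:>14}{false_negative:>18}")
print(f"{'Truly not drunk':<18}{false_positive:>14}{true_negative:>18}")

print(f"Accuracy:           {accuracy:.3f}")            # 0.800
print(f"True positive rate: {true_positive_rate:.3f}")  # 0.967
print(f"True negative rate: {true_negative_rate:.3f}")  # 0.550
```

The gap between the true positive rate and the true negative rate is what the interpretation questions hinge on: the test rarely misses a drunk driver but flags a large share of sober drivers.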

Second, assume that the test above generates the ROC curve labeled B in Figure 5-10 on page 217. Another test is identified that generates the ROC curve labeled A in Figure 5-10 on page 217. Which test is preferred? Why?

I need this today


THIS IS THE STUDENT POST

Nicholas McKay, Wed May

Hello,

I am choosing Table #5 Successful Movies

My point of view is The Lord of the Rings: The Return of the King

I want to see it compared to the other Lord of the Rings movies, including the Hobbit movies

I HAVE ATTACHED TABLE 5 TO THIS ASSIGNMENT. PLEASE READ INSTRUCTIONS CAREFULLY


Post#2 Instructions


Posting an image of a classmate’s chart created in Excel and providing analysis, reasoning for decisions, and/or questions:


Choose a classmate’s post that has not already been responded to.

  1. Develop a properly formatted chart in Excel that demonstrates the message and axes he/she selected.
  2. Include a chart title with message, axis titles, labels, and legend if appropriate.
  3. Paste a screenshot of the chart into the response post on the discussion board.
  4. Note any questions you have or items that were unclear as you were creating the chart. This should be at least one paragraph. Examples of items you could discuss:
  • Did you understand the message?
  • Did the categories suggested align with the message requested?
  • Were you able to create the chart easily from his/her description?
  • Were there questions you had as to the expectations of his/her chart?
  • If everything was easy to follow, explain some of the decisions you made when developing the chart.

Standard deviation and mean for Statistics

Submit as a Word document

  1. Find a study that uses mean and standard deviation. Remember that you are looking for studies, not for a website that describes or explains mean and standard deviation.

    • Write one paragraph describing the study and one paragraph that tells what the conclusions were from the study. These two paragraphs together should be a minimum of 150 words.
    • Paste the table or paragraph from the study which contains the mean and standard deviation or attach the entire study if you are unable to paste just the part which contains a reference to the mean and standard deviation.
    • Cite the study in appropriate APA format.

Discussion postings

Read the article reviews from the two students below and post a reflection on each student’s article review. Make certain that your response is written in complete sentences and makes reference to what the article is about and how statistics are used in the article. Each of your responses should be a minimum of 150 words in length.


Student #1


Katherine

The study I found examines the relationship between temperature and human mortality. A nine-member research team set out to examine which end of the temperature spectrum (hot or cold weather) exerts a greater deleterious effect on human health in Chengdu, China (co2science.org, 2017). The researchers chose Chengdu because of its large population of 14.65 million people. By collecting the daily temperature and death record data from January 1, 2011 through December 31, 2014 and estimating the relationship between daily mortality and temperature using a distributed lag model, the researchers discovered that the effect of cold temperatures was ten times larger than that of warm temperatures (co2science.org, 2017).


I found this article interesting because it’s winter and there are frigid temperatures in many areas across the United States. Sometimes I feel like I’m so cold I might die. But I also survived the summer of 2011 in Fort Worth, TX, when I felt so hot that I thought I was going to die. I was pretty surprised to see that death due to cold weather was more prevalent because it seems like there are more ways to warm up than there are to cool off.


Student #2: Jessica Eller


Jessica

This study compared musculoskeletal pain and co-morbidity in adults and the impact these effects had on social involvement. The study consisted of 1,811 (n=1,811) adults ages 18 and older who, based on answers to a questionnaire, complained of pain and associated insomnia which negatively affected their social interaction (Baker, 2017). Patients responded to three different categories of insomnia and their baseline results were compared to a 12-month follow up.

The study concluded that pain and insomnia often occur simultaneously, leading to diminished functional ability. The study suggests that due to the correlation between pain and insomnia it is essential for physicians to assess such complaints, allowing the physician to offer necessary interventions. Since there were different categories defining insomnia there were respective confidence intervals (CI) for each category. Relating to delayed sleep onset the 95% CI ranged from 1.5-3.5; difficulties maintaining sleep ranged from 4.0-9.1; early wakening from 2.3-4.4; and non-restorative sleep from 2.9-5.9 (Baker, 2017). There was a considerable correlation between delayed sleep onset and the effects on social involvement at the 12-month follow up. The final CI ranged from 0.1-.99 with a population proportion of .049 (Baker, 2017).

http://bmcfampract.biomedcentral.com/articles/10.1186/s12875-017-0593-5#Bib1

This hyperlink directs you to the article and includes all related charts and graphs.