I was a sophomore at Clemson University in 1977, taking a course in probability and statistics. We learned the basics of statistical inference and used our Texas Instruments calculators to compute means, medians, t-scores, and more. The problems in our book were written in summation notation, which we had to write out on tests and in our homework; for example, the sum of the numbers 2, 4, and 6 could be written as Σx = 2 + 4 + 6 = 12. As we approached the end of the semester we received a very large set of numbers that required us to use the university’s computer center. Fortunately for me, my roommate was a math major and volunteered to complete the assignment, and I got a “B” in the course.
Thirteen years later I was a graduate student at the University at Buffalo. I took two graduate-level statistics courses where we regularly ran ANOVAs and ANCOVAs along with a host of regressions and t-tests. We didn’t use Texas Instruments calculators. We were instructed to use IBM mainframes, and we tediously entered our data into text files that had to be formatted “just so” before they would run correctly and produce the results we were looking for. It was a relief compared with manually pressing the Sum key on a calculator; though formatting the text files was tedious, it was far superior to the earlier method.
A couple of years later we were performing the same statistical analyses in Microsoft Excel. We could generate ANOVA, ANCOVA, and all the other results without the tedious formatting. In the last fifty years we have seen tremendous changes in data analysis, driven by improved methods of collection such as sensors, mobile devices, and social media. We have big data tools built on Python and R that didn’t exist fifty years ago.
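To give a sense of how much lighter the mechanics have become, here is a minimal sketch of a one-way ANOVA in Python using the scipy library; the three groups of scores and the variable names are hypothetical, chosen only for illustration, not data from my coursework.

    from scipy import stats

    # Three hypothetical groups of scores (illustrative values only)
    group_a = [82, 91, 77, 85, 88]
    group_b = [79, 84, 90, 73, 81]
    group_c = [88, 95, 92, 86, 90]

    # One-way ANOVA: returns the F statistic and the p-value
    f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

What once required a carefully formatted mainframe job now runs in a few lines on a laptop.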
We have improved data visualizations, which were rudimentary and frequently inaccurate fifty years ago. These methods have improved as our tools have improved, and this in turn has led to more accurate interpretation of results. The ability to use a variety of techniques and technologies has led to a greater understanding of the world we live in.
Now we have machine learning algorithms and artificial intelligence that enable computers to do work that was hitherto reserved for humans. These changes have enabled more sophisticated and faster data analysis.
ChatGPT can automate data preparation tasks, allowing data analysts to focus on higher-value work. It offers advanced natural language processing capabilities to provide insights, presents a natural language interface for a better user experience, and integrates with data visualization tools to present data insights more interactively. Ultimately, ChatGPT can revolutionize the way data is analyzed, leading to faster and more informed decisions.
Should we return to 1977 with TI calculators and punch cards, or dare we enter a new age that can give us more diverse and accurate representations of the world we live in?