J. Adrian Ayala's Portfolio https://aadatascience.com Data Science Projects and Tips Thu, 29 Aug 2024 23:11:48 +0000

Shell https://aadatascience.com/2024/08/29/shell/ Thu, 29 Aug 2024 22:54:17 +0000 https://aadatascience.com/?p=92 Steve Bourne wrote the original Bourne shell which appeared in the Seventh Edition Bell Labs Research version of Unix. Many variants have come and gone over time (csh, ksh, and so on).

Below is an example of a “Hello World” script:

$ echo '#!/bin/sh' > my-script.sh
$ echo 'echo Hello World' >> my-script.sh
$ chmod 755 my-script.sh
$ ./my-script.sh
Hello World
$

A Linux shell is a command-line interpreter that acts as an interface between the user and the kernel, allowing users to interact with the system and execute programs. When a user logs in or opens a console window, the kernel creates a new shell instance. The shell parses and sends commands to the operating system, which then executes the corresponding program. For example, if a user enters ls in the terminal, the shell will execute the ls command. For a primer on Linux commands, check here.
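
The parse-then-execute step described above can be sketched in Python. This is only a rough illustration, not how a real shell is implemented: the helper name `run_command` is made up for this example, and it ignores pipes, redirection, variables, and built-ins. It assumes a Unix-like system where `echo` exists as a program on the PATH.

```python
import shlex
import subprocess


def run_command(line: str) -> str:
    """Split one command line into words, run the program, return its stdout."""
    argv = shlex.split(line)  # "ls -l /tmp" -> ["ls", "-l", "/tmp"]
    if not argv:
        return ""  # an empty line runs nothing
    # subprocess asks the operating system to find argv[0] on PATH and execute it
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    print(run_command("echo Hello World"), end="")
```

A real shell adds a loop around this step: print a prompt, read a line, run it, repeat.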

Windows 10 offers a native SSH client for you to use. In the past, it was necessary to use a third-party client such as PuTTY. You can now use the SSH client built directly into Windows 10 to access your DreamHost web server without the need to download any other software.

This is also enabled by default, so you do not have to enable it within Windows.

Using the SSH client within the Command Prompt
To use the SSH client, open your Command Prompt. From there you can run SSH commands.

In Windows, you can open a Command Prompt by typing ‘cmd’ into the search bar at the bottom left.

You can also run SQL commands over SSH or use phpMyAdmin, a free and open source administration tool for MySQL and MariaDB. As a portable web application written primarily in PHP, it has become one of the most popular MySQL administration tools, especially for web hosting services.

plotly https://aadatascience.com/2024/08/29/plotly/ Thu, 29 Aug 2024 22:54:02 +0000 https://aadatascience.com/?p=109 Plotly's Python graphing library makes interactive, publication-quality graphs, including line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, polar charts, and bubble charts.

random https://aadatascience.com/2024/08/29/random/ Thu, 29 Aug 2024 22:53:53 +0000 https://aadatascience.com/?p=106 The random module implements pseudo-random number generators for various distributions.

For integers, there is uniform selection from a range. For sequences, there is uniform selection of a random element, a function to generate a random permutation of a list in-place, and a function for random sampling without replacement.
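
The four operations mentioned above map onto `randint`, `choice`, `shuffle`, and `sample`. Here is a minimal sketch; the card list and the seed value are arbitrary choices for illustration, and the seed is there only so that reruns give the same results.

```python
import random

rng = random.Random(42)  # arbitrary seed, used only for repeatability

die = rng.randint(1, 6)  # uniform integer from a range, both endpoints included

deck = ["A", "K", "Q", "J", "10"]
card = rng.choice(deck)  # uniform selection of one element from a sequence
rng.shuffle(deck)  # random permutation of the list, in place
hand = rng.sample(deck, k=3)  # three elements drawn without replacement

print(die, card, deck, hand)
```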

seaborn https://aadatascience.com/2024/08/29/seaborn/ Thu, 29 Aug 2024 22:53:41 +0000 https://aadatascience.com/?p=111 Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

matplotlib https://aadatascience.com/2024/08/29/matplotlib/ Thu, 29 Aug 2024 22:53:30 +0000 https://aadatascience.com/?p=113 Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations.

sklearn https://aadatascience.com/2024/08/29/sklearn/ Thu, 29 Aug 2024 22:51:37 +0000 https://aadatascience.com/?p=115 Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities.

Pandas https://aadatascience.com/2024/08/29/pandas/ Thu, 29 Aug 2024 22:49:57 +0000 https://aadatascience.com/?p=102 pandas is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.

numpy https://aadatascience.com/2024/08/29/numpy/ Thu, 29 Aug 2024 22:46:15 +0000 https://aadatascience.com/?p=104 NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

Rate My Professor https://aadatascience.com/2024/08/29/rate-my-professor/ Thu, 29 Aug 2024 22:32:32 +0000 https://aadatascience.com/?p=98 The one thing I learned in college is to choose classes based on intel gathered from Ratemyprofessor.com. But I mostly focus on the negative reviews because they give you better insight into the person writing the review, not necessarily the professor. Bottom line, some of these negative reviews actually indicate a good professor, because an easy A may harm you in the long run. Also, a review where the professor requires the student to think more and write less suggests a more engaging class, because they’re not spoon-feeding information. Most professors use prompts that make you deconstruct the sentence, and that helps you understand how to engage your mind to know exactly what outcome is desired. IMO engaging your mind during each conversation and not assuming intent is a lifelong learning task.

SQL Tips and Tricks https://aadatascience.com/2024/08/28/sql-tips-and-tricks/ Wed, 28 Aug 2024 23:36:30 +0000 https://aadatascience.com/?p=84

USING

SELECT friend_id, e.name AS entree, d.name AS dessert
FROM entrees e
INNER JOIN desserts d USING (friend_id);

  • The query above shows the basic anatomy of a typical SELECT statement.
  • The SELECT statement is used to select data from a database.
  • The INNER JOIN keyword selects records that have matching values in both tables.
  • The FROM command is used to specify which table to select or delete data from.
  • The AS command is used to rename a column or table with an alias.

USING is helpful for simplifying your join when you are joining two tables on columns with the same name. In the above example, you have two tables which are lists of entrees and desserts and the ID of the friend who knows how to prepare them. If you have offers from multiple friends to come over for dinner, you want to know what possible combinations of entree and dessert each friend can make for you before you decide which offer to accept, so you run this query. USING makes it so you do not have to write out ON e.friend_id = d.friend_id, but what I find particularly helpful is that you no longer have to qualify which friend_id you are referring to. This prevents the ever-frustrating error ERROR: column reference "friend_id" is ambiguous when you forget to put e. or d. in front of friend_id.
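
To try the USING query without a database server, here is a small sketch that runs it from Python against an in-memory SQLite database. SQLite is standing in for Postgres here, and the sample rows and the ORDER BY clause are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entrees  (friend_id INTEGER, name TEXT);
    CREATE TABLE desserts (friend_id INTEGER, name TEXT);
    -- hypothetical sample data
    INSERT INTO entrees  VALUES (1, 'lasagna'), (1, 'tacos'), (2, 'curry');
    INSERT INTO desserts VALUES (1, 'tiramisu'), (2, 'brownies'), (2, 'flan');
""")

# friend_id needs no e. or d. prefix because USING merges the join column
rows = conn.execute("""
    SELECT friend_id, e.name AS entree, d.name AS dessert
    FROM entrees e
    INNER JOIN desserts d USING (friend_id)
    ORDER BY friend_id, entree, dessert
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

Each friend appears once per entree and dessert combination they can make.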

CASE


SELECT CASE
    WHEN c.country = 'US' THEN c.state
    ELSE c.country
END AS region
FROM clients c;


CASE operates similarly to if, else if, and else statements. It returns what comes after THEN for the first WHEN condition that is true. If none of the WHEN conditions are true, it returns what comes after ELSE. In the example, you want to send a gift to each of your clients because of how much you appreciate them, but first, you want to approximate shipping costs so you need to find out where they all live. If they don’t live in the United States, you’re fine with just calculating the shipping costs based on the country, but otherwise, you want to know which specific state they live in. CASE allows you to check the country condition and return the shipping region based on that check.
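
Here is a sketch of the CASE query run from Python against an in-memory SQLite database, which stands in for Postgres. The client rows are invented, and the name column and ORDER BY are additions so the output is easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (name TEXT, country TEXT, state TEXT);
    -- hypothetical sample data
    INSERT INTO clients VALUES
        ('Ana',   'US', 'TX'),
        ('Ben',   'CA', NULL),
        ('Chloe', 'US', 'NY');
""")

# US clients are reported by state, everyone else by country
regions = conn.execute("""
    SELECT c.name,
           CASE WHEN c.country = 'US' THEN c.state
                ELSE c.country
           END AS region
    FROM clients c
    ORDER BY c.name
""").fetchall()

for row in regions:
    print(row)
conn.close()
```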

String Pattern Matching (LIKE/ILIKE/~/~*)


SELECT b.title, b.author
FROM books b
WHERE b.title LIKE '%Pirate%';

SELECT b.title, b.author
FROM books b
WHERE b.title ~ 'Pirate';

SELECT b.title, b.author
FROM books b
WHERE b.title ILIKE '%pirate%';

SELECT b.title, b.author
FROM books b
WHERE b.title ~* 'pirate';


If you need to pattern match on a string, you are provided with quite a few options. The most performant option available to you is LIKE, which uses the built-in SQL matching including % for 0 or more characters. You also have the Postgres-exclusive ~ which has the power of regex behind it, if you need a more complicated match. Then, you have the case insensitive versions of both, ILIKE and ~* respectively. For the example, we see four versions of trying to find a book that mentions “Pirate” in the title, since you want to read an old school high seas adventure. Personally, I have always loved the ~* for quick queries where I just need to find something in a table quickly and the performance is not much of an issue, but I would recommend using LIKE or ILIKE for production code if possible.
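
As a rough analogy, the four operators can be imitated in Python. This is not how Postgres implements them: the helper names are invented for this sketch, Python's `re` syntax differs in places from Postgres regular expressions, and LIKE escape characters are ignored.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # % matches zero or more characters
        elif ch == "_":
            parts.append(".")  # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return "".join(parts)


def like(value: str, pattern: str) -> bool:  # LIKE: whole string, case sensitive
    return re.fullmatch(like_to_regex(pattern), value, re.DOTALL) is not None


def ilike(value: str, pattern: str) -> bool:  # ILIKE: same, ignoring case
    flags = re.DOTALL | re.IGNORECASE
    return re.fullmatch(like_to_regex(pattern), value, flags) is not None


def regex_match(value: str, pattern: str) -> bool:  # ~ : regex found anywhere
    return re.search(pattern, value) is not None


def regex_imatch(value: str, pattern: str) -> bool:  # ~* : same, ignoring case
    return re.search(pattern, value, re.IGNORECASE) is not None


print(like("The Pirate King", "%Pirate%"))
print(ilike("pirates of the coast", "%Pirate%"))
```

Note that LIKE has to match the whole string, which is why the pattern needs % on both sides, while ~ matches anywhere in the string unless the regex is anchored.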

UNION (or UNION ALL)


SELECT s.pricing_id, s.price
FROM snacks s
UNION
SELECT t.pricing_id, t.price
FROM tickets t
UNION
SELECT m.pricing_id, m.price
FROM memberships m
ORDER BY price;


UNION allows you to combine the results of multiple queries into one result set. In the example, we have a theater which has three types of products that they store in separate tables due to the different information required for each product type. They have snacks from the snack bar, tickets for the shows, and memberships that allow you to support the theater while also getting discounted prices on other purchases. The theater wants a list of all the prices of their products along with the ID used by their POS system for record-keeping purposes. UNION allows them to take the results from all three tables and bring the distinct ones together. This works as long as each SELECT returns the same number of columns and the columns have similar types. If you are willing to not worry about making each row distinct, you can use UNION ALL instead.
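
The difference between UNION and UNION ALL is easy to see with a duplicate row. This sketch uses an in-memory SQLite database in place of Postgres, and the prices are invented, with one (pricing_id, price) pair deliberately stored in two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE snacks      (pricing_id INTEGER, price INTEGER);
    CREATE TABLE tickets     (pricing_id INTEGER, price INTEGER);
    CREATE TABLE memberships (pricing_id INTEGER, price INTEGER);
    -- hypothetical data; (2, 8) appears in both snacks and tickets
    INSERT INTO snacks      VALUES (1, 5), (2, 8);
    INSERT INTO tickets     VALUES (3, 20), (2, 8);
    INSERT INTO memberships VALUES (4, 60);
""")

query = """
    SELECT s.pricing_id, s.price FROM snacks s
    {op}
    SELECT t.pricing_id, t.price FROM tickets t
    {op}
    SELECT m.pricing_id, m.price FROM memberships m
    ORDER BY price
"""

union_rows = conn.execute(query.format(op="UNION")).fetchall()
union_all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()

print(union_rows)  # duplicates removed
print(len(union_all_rows))  # every row from every table kept
conn.close()
```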

FILTER


SELECT COUNT(*) AS member_count, COUNT(*) FILTER (WHERE m.expiration_date > current_date) AS active_member_count
FROM members m;

The COUNT() function returns the number of rows that match a specified criterion.


FILTER gives you the ability to run an aggregate function over a subset of the overall result set. Let’s go back to the theater from the previous example for this one. Now, they want to know how many total members they have ever had and how many active members they currently have. To get the overall total, you can just run COUNT, but you can run COUNT again with the additional FILTER to only count the members who have not hit their expiration date yet.
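
Here is a sketch of the two counts side by side, using COUNT(*) with and without FILTER. It runs against an in-memory SQLite database rather than Postgres, and it assumes the bundled SQLite is version 3.30 or newer, which is when FILTER on aggregates was added. The member rows are invented, with expiration dates far enough in the past or future that the result does not depend on today's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INTEGER, expiration_date TEXT);
    -- hypothetical data: one long-expired member, two active ones
    INSERT INTO members VALUES
        (1, '2000-01-01'),
        (2, '9999-12-31'),
        (3, '9999-12-31');
""")

# ISO-formatted date strings compare correctly as text in SQLite
member_count, active_member_count = conn.execute("""
    SELECT COUNT(*) AS member_count,
           COUNT(*) FILTER (WHERE m.expiration_date > current_date)
               AS active_member_count
    FROM members m
""").fetchone()

print(member_count, active_member_count)
conn.close()
```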

There are a number of manuals I use; here is the one I use the most:

W3Schools
