From ac0e7766ede9636e9515e280e75adae14cbd20e2 Mon Sep 17 00:00:00 2001
From: miguel
Date: Thu, 11 Jan 2024 09:52:03 +0000
Subject: [PATCH] fix(numpy): fix results and improve readme and audit

---
 subjects/ai/numpy/README.md       | 12 ++++++------
 subjects/ai/numpy/audit/README.md | 11 +++++------
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/subjects/ai/numpy/README.md b/subjects/ai/numpy/README.md
index 3d50e4336..86ed01c0e 100644
--- a/subjects/ai/numpy/README.md
+++ b/subjects/ai/numpy/README.md
@@ -13,7 +13,7 @@ I suggest to use the most recent one.
 
 ### Resources
 
-- [Why Should We Use NumPy](https://medium.com/fintechexplained/)why-should-we-use-NumPy-c14a4fb03ee9
+- [Why Should We Use NumPy](https://medium.com/fintechexplained/why-should-we-use-NumPy-c14a4fb03ee9)
 - [NumPy Documentation](https://numpy.org/doc/)
 - [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/)
 
@@ -183,14 +183,14 @@ The goal of this exercise is to learn to access values of n-dimensional arrays e
 [1, 0, 1, 0, 0, 0, 1, 0, 1],
 [1, 0, 1, 1, 1, 1, 1, 0, 1],
 [1, 0, 0, 0, 0, 0, 0, 0, 1],
-[1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=int8)
+[1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=np.int8)
 ```
 
 3. Using **broadcasting** create an output matrix based on the following two arrays:
 
 ```python
-array_1 = np.array([1,2,3,4,5], type=int8)
-array_2 = np.array([1,2,3], dtype=int8)
+array_1 = np.array([1,2,3,4,5], dytpe=np.int8)
+array_2 = np.array([1,2,3], dytpe=np.int8)
 ```
 
 Expected output:
@@ -292,9 +292,9 @@ The goal of this exercise is to perform fundamental data analysis on real data u
 
 The dataset chosen for this task is the [red wine dataset](https://archive.ics.uci.edu/ml/datasets/wine+quality)
 
-1. Load the data using `genfromtxt`, specifying the delimiter as ';', and optimize the numpy array size by reducing the data types. Ensure that the sum of absolute differences between the original and the "memory" optimized dataset is less than `1.10**-3`. Use `np.float32` and verify that the resulting numpy array weighs **76800 bytes**.
+1. Load the data using `genfromtxt`, specifying the delimiter as ';', and optimize the numpy array size by reducing the data types. Use `np.float32` and verify that the resulting numpy array weighs **76800 bytes**.
 
-2. Display the 2nd, 7th, and 12th rows as a two-dimensional array.
+2. Display the 2nd, 7th, and 12th rows as a two-dimensional array. Exclude `np.nan` values if present.
 
 3. Determine if there is any wine in the dataset with an alcohol percentage greater than 20%. Return True or False.
 
diff --git a/subjects/ai/numpy/audit/README.md b/subjects/ai/numpy/audit/README.md
index 980c535f1..d376650c8 100644
--- a/subjects/ai/numpy/audit/README.md
+++ b/subjects/ai/numpy/audit/README.md
@@ -300,12 +300,11 @@ Use this in the solution to confirm:
 
 
 ```Python
-# Check the optimized data size and absolute differences
+# Check the optimized data size
 optimized_size = optimized_data.nbytes
-abs_diff = np.sum(np.abs(original_data - optimized_data))
 
-# To verify if criteria are met:
-if abs_diff < 1.10**-3 and optimized_size <= 76800:
+# Verify if the dataset size criterion is met
+if optimized_size <= 76800:
     print("Data optimized successfully.")
 else:
     print("Optimization criteria not met.")
@@ -313,6 +312,8 @@ else:
 
 ##### For question 2:
 
+"Display the 2nd, 7th, and 12th rows as a two-dimensional array. Exclude `np.nan` values if present."
+
 ###### Is the output the following?
 
 ```console
@@ -324,8 +325,6 @@ else:
 0.52 9.9 5. ]]
 ```
 
-This slicing gives the answer `data[[2,7,12],:]`.
-
 ##### For question 3:
 
 "Determine if there is any wine in the dataset with an alcohol percentage greater than 20%. Return True or False."