2.
-------- is the most popular open-source Python library used for data analysis.
Explanation
Pandas is the most popular open-source Python library used for data analysis. It provides powerful data structures and data analysis tools, making it easier to manipulate, analyze, and visualize data. With its intuitive and flexible API, Pandas allows users to handle large datasets efficiently and perform various operations such as filtering, grouping, and merging data. It also offers support for time series data and missing data handling. Overall, Pandas is widely used in the data science community for its versatility and ease of use in data analysis tasks.
3.
To work with Pandas in Python, you need to import the pandas library into your Python environment.
Explanation
To work with Pandas in Python, it is necessary to import the pandas library. This library provides various data structures and functions for data manipulation and analysis. By importing the pandas library, you can access its functionalities and use them in your Python environment. Therefore, the correct answer is True.
4.
A Series is a Pandas data structure that represents a one-dimensional array-like object of indexed data.
Explanation
A Series in Pandas is indeed a one-dimensional array-like object of indexed data. It is similar to a column in a spreadsheet or a database table. The data in a Series is organized in a linear manner and can be accessed using the index labels. Therefore, the given statement is correct.
5.
The two basic and universally-popular data structures of Pandas are ........... and ............
Correct Answer
A. Series, DataFrame
Explanation
Pandas is a popular data manipulation library in Python, and it provides two basic data structures: Series and DataFrame. A Series is a one-dimensional array-like object that can hold any data type, while a DataFrame is a two-dimensional table-like data structure with labeled axes (rows and columns). These two data structures are widely used in Pandas for various data analysis and manipulation tasks.
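The difference between the two structures can be sketched as follows. The names and values are made up for illustration only.

```python
import pandas as pd

# A Series: one-dimensional, one label per value.
ages = pd.Series([31, 25, 47], index=["ann", "bob", "cy"])

# A DataFrame: two-dimensional, with labelled rows and labelled columns.
people = pd.DataFrame(
    {"age": [31, 25, 47], "city": ["Oslo", "Lima", "Pune"]},
    index=["ann", "bob", "cy"],
)

series_dims = ages.ndim        # 1
frame_dims = people.ndim       # 2
frame_shape = people.shape     # (3, 2): 3 rows, 2 columns

# Selecting a single column of a DataFrame gives back a Series.
age_column = people["age"]
```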
6.
To create a series object, ------- method is used.
Correct Answer
A. Series()
Explanation
The correct answer is "Series()". This is the Pandas constructor used to create a Series object. A Series is a one-dimensional labeled array that can hold any data type. It is similar to a column in a spreadsheet or a SQL table. Series() accepts data such as a list, array, or dictionary, and optionally a set of labels for the index.
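A minimal sketch of a few common ways to call Series(), using made-up values:

```python
import pandas as pd

# From a list: Pandas assigns the default index 0, 1, 2.
from_list = pd.Series([10, 20, 30])

# From a list with explicit index labels.
labelled = pd.Series([10, 20, 30], index=["a", "b", "c"])

# From a dictionary: the keys become the index labels.
from_dict = pd.Series({"x": 1.5, "y": 2.5})

value_b = labelled["b"]              # 20
dict_labels = list(from_dict.index)  # ['x', 'y']
```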
7.
To create an empty series object, Series() method is used with ---------.
Correct Answer
No parameter, no arguments
Explanation
To create an empty Series object, the Series() method is used with no arguments or you can explicitly pass in None or an empty list [].
Example: pd.Series() or pd.Series([])
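As a slightly fuller sketch, the empty Series below is given an explicit dtype. That is a choice made for this example, not something the question requires: some Pandas versions emit a warning when an empty Series is created without one.

```python
import pandas as pd

# dtype is passed explicitly (an assumption of this sketch) to avoid
# version-dependent warnings about the default dtype of an empty Series.
empty = pd.Series(dtype="float64")
also_empty = pd.Series([], dtype="float64")

length = len(empty)       # 0
is_empty = empty.empty    # True
```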
8.
Missing data in Pandas series and data frames can be filled with a ........... value.
Correct Answer
NaN
Explanation
Missing data in Pandas Series and DataFrames is represented by a NaN value. NaN stands for "Not a Number"; it is a floating-point value that Pandas uses to mark missing or undefined data. Marking missing entries as NaN makes them easy to identify (for example with isna()) and to handle (for example with fillna() or dropna()) in data analysis tasks.
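The following sketch, with made-up scores, shows how NaN marks missing entries and how they can be detected and replaced:

```python
import numpy as np
import pandas as pd

# In a float Series, None is stored as NaN.
scores = pd.Series([4.0, np.nan, 9.0, None])

missing_mask = scores.isna().tolist()   # [False, True, False, True]

# fillna() returns a new Series with the missing entries replaced.
filled = scores.fillna(0.0).tolist()    # [4.0, 0.0, 9.0, 0.0]
```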
9.
In pandas, the ______ accessor is used for label-based indexing, allowing you to select rows and columns by their labels.
Correct Answer
loc
Explanation
The loc accessor provides a powerful and intuitive way to access and manipulate data within a pandas DataFrame. It selects specific rows and columns by their labels, which makes data retrieval and manipulation more readable and less error-prone than position-based selection.
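A short sketch of label-based selection with loc, using an invented DataFrame:

```python
import pandas as pd

df = pd.DataFrame(
    {"price": [3.5, 1.2, 8.0], "stock": [10, 0, 4]},
    index=["apple", "pear", "kiwi"],
)

# One cell: row label first, then column label.
pear_price = df.loc["pear", "price"]           # 1.2

# Lists of labels select several rows and columns at once.
subset = df.loc[["apple", "kiwi"], ["stock"]]

# Label slices include both end points.
rows = df.loc["apple":"pear"]
```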
10.
Selecting a subset from a data frame requires the ----- and ----- indexers.
Correct Answer
loc, iloc
Explanation
To select a subset from a data frame, the loc and iloc indexers are used. Both are used with square brackets rather than called like functions. loc selects rows and columns by label, while iloc selects rows and columns by integer position. Together they provide a convenient way to extract specific data from a data frame based on the desired labels or positions.
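The distinction is easiest to see when the row labels are integers that do not match the positions, as in this illustrative sketch:

```python
import pandas as pd

# Row labels are 10, 20, 30; row positions are 0, 1, 2.
df = pd.DataFrame({"value": ["a", "b", "c"]}, index=[10, 20, 30])

by_label = df.loc[10, "value"]    # row labelled 10 -> "a"
by_position = df.iloc[1, 0]       # second row, first column -> "b"

# Positional slices exclude the end point, like ordinary Python slices.
first_two = df.iloc[0:2]
```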
11.
............ method helps in understanding and analyzing the behaviour of data.
Correct Answer
Statistical
Explanation
Statistical method helps in understanding and analyzing the behavior of data by using various statistical techniques such as data collection, organization, analysis, interpretation, and presentation. It involves the use of statistical measures, models, and tools to summarize and make inferences about the data. By applying statistical methods, patterns, trends, and relationships within the data can be identified, enabling researchers to draw meaningful conclusions and make data-driven decisions.
12.
----- is the process of turning the values of a dataset (or a subset of it) into one single value.
Correct Answer
Data Aggregation
Explanation
Data aggregation is the process of combining multiple values or data points into a single value. It involves summarizing or condensing the information in a dataset or a subset of it. This can be done by applying mathematical operations such as sum, average, count, or by grouping data based on certain criteria. The result is a single value that represents the collective information of the dataset, making it easier to analyze and interpret the data.
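As a small sketch with invented sales figures, aggregation can reduce a whole column to one value, or reduce each group to one value:

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "amount": [100, 250, 50, 150],
})

# Whole column reduced to a single value.
total = sales["amount"].sum()      # 550
average = sales["amount"].mean()   # 137.5

# One aggregated value per group.
per_region = sales.groupby("region")["amount"].sum()  # north 150, south 400
```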
13.
................ function is used for finding the highest values from a given set of values or column of a dataframe or a series.
Correct Answer
max()
Explanation
The max() function is a common tool in data analysis and programming for identifying the highest value within a dataset. It can be applied to various data structures, such as arrays, lists, or dataframes, to efficiently extract the maximum value, simplifying data analysis and decision-making.
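A brief sketch of max() on a column and on a whole DataFrame; the marks are made up:

```python
import pandas as pd

df = pd.DataFrame({"math": [70, 92, 85], "art": [88, 64, 91]})

# Highest value in one column (a Series).
best_math = df["math"].max()     # 92

# On a DataFrame, max() returns the highest value of each column.
col_max = df.max()               # math 92, art 91

# idxmax() gives the index label where the maximum occurs.
best_row = df["math"].idxmax()   # 1
```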
14.
.......... function is used to add all of the values in a particular column of a dataframe.
Correct Answer
sum()
Explanation
The sum() function is used to add all of the values in a particular column of a dataframe. It takes the column as input and returns the sum of all the values in that column. This function is commonly used in data analysis and manipulation tasks to calculate the total of a specific column.
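For illustration, the sketch below sums one column and then all numeric columns of an invented DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    "item": ["pen", "ink", "pad"],
    "qty": [3, 1, 6],
    "price": [1.5, 4.0, 2.0],
})

# Total of a single column.
qty_total = df["qty"].sum()           # 10

# Totals of every numeric column; the text column is left out.
totals = df.sum(numeric_only=True)    # qty 10.0, price 7.5
```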
15.
Passing -------- argument skips the missing values by default.
Correct Answer
skipna=True
Explanation
The "skipna" argument controls whether missing values are ignored. Its default value is True, so functions such as sum() and mean() skip missing values unless skipna=False is passed. Missing values are then left out of the calculation, which is performed only on the non-missing values.
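The effect of skipna can be sketched on a small Series with one missing value:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# Default behaviour (skipna=True): NaN is ignored.
total_default = s.sum()             # 4.0
mean_default = s.mean()             # 2.0

# With skipna=False the missing value propagates into the result.
total_strict = s.sum(skipna=False)  # nan
```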
16.
............ function calculates the most frequently occurring value(s) along the selected axis.
Correct Answer
mode()
Explanation
The mode() function returns the value or values that occur most often along the selected axis. If several values are tied for the highest frequency, all of them are returned.
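A minimal sketch of mode() on a Series with a tie and on a small DataFrame, both invented for illustration:

```python
import pandas as pd

# 3 and 5 both appear twice, so both are returned.
s = pd.Series([2, 3, 3, 5, 5])
modes = s.mode().tolist()      # [3, 5]

# On a DataFrame, mode() is computed per column by default.
df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "y", "y"]})
df_modes = df.mode()           # one row: a = 1, b = "y"
```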
17.
A series object is a 2D array that stores an ordered collection of columns that can store data of different types.
Correct Answer
False
Explanation
The given statement is false. A series object is not a 2D array, but a one-dimensional labeled array that can hold any data type. It is similar to a column in a spreadsheet or a SQL table. Each element in a series object is associated with an index label (labels are usually, but not necessarily, unique), which allows for easy data manipulation and analysis.
18.
A data frame is a 1D array-like object containing an array of data and an associated array of data labels.
Explanation
A data frame is not a 1D array-like object, but rather a 2D table-like structure that contains rows and columns of data. It is commonly used in data analysis and manipulation, as it allows for easy organization and manipulation of data.
19.
To access a subset of a dataframe we can use the loc indexer.
Explanation
The statement is true because loc is used to access subsets of a dataframe in pandas. It is written with square brackets, as in df.loc[...], and selects rows and columns based on labels or boolean conditions. It is particularly useful when working with large datasets and needing to extract specific data for analysis or manipulation.
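Selection by boolean condition can be sketched like this, with made-up names and ages:

```python
import pandas as pd

df = pd.DataFrame({"name": ["ann", "bob", "cy"], "age": [31, 25, 47]})

# The condition yields one True/False per row; loc keeps the True rows
# and, here, only the "name" column.
over_30 = df.loc[df["age"] > 30, ["name"]]

names = over_30["name"].tolist()   # ['ann', 'cy']
```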
20.
The values NaN, NaT, and None are the same in Pandas.
Explanation
In Pandas, NaN (Not a Number), NaT (Not a Time), and None are all treated as missing or null values: isna() returns True for each of them, and aggregations skip them by default. In that sense the statement "The values NaN, NaT, and None are the same in Pandas" is true. They are still distinct objects, however, and direct comparisons can behave differently; for example, NaN is not equal to itself.
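A short sketch of how Pandas treats the three markers when checking for missing data:

```python
import numpy as np
import pandas as pd

markers = [np.nan, pd.NaT, None]

# pd.isna() reports every one of them as missing.
all_missing = [bool(pd.isna(m)) for m in markers]   # [True, True, True]

# They are not interchangeable in plain comparisons:
# NaN compares unequal even to itself.
nan_equals_itself = np.nan == np.nan                # False
```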
21.
The all() and any() functions are used to check whether all items, or at least one item, are non-zero, non-empty, or not False.
Explanation
The statement is true because the all() function returns True if all items in an iterable are true, while the any() function returns True if at least one item in an iterable is true. These functions are commonly used to check the truthiness of multiple values in a concise and efficient way.
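The behaviour can be sketched on a Series and a DataFrame with invented values:

```python
import pandas as pd

s = pd.Series([1, 0, 3])

every_item_truthy = bool(s.all())   # False, because of the 0
some_item_truthy = bool(s.any())    # True

# On a DataFrame the check runs per column by default.
df = pd.DataFrame({"a": [True, True], "b": [True, False]})
per_column = df.all()               # a True, b False
```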
22.
Which of the following commands is used to install Pandas?
Correct Answer
A. pip install pandas
Explanation
The correct answer is "pip install pandas". This command is used to install the Pandas library in Python. "pip" is a package manager for Python that allows users to easily install and manage software packages. By using the "install" command followed by the package name "pandas", the user can download and install the Pandas library onto their system.
23.
A two-dimensional labelled array that is an ordered collection of columns storing heterogeneous data types is:
Correct Answer
A. Data Frame
Explanation
A two-dimensional labelled array that is an ordered collection of columns storing heterogeneous data types is called a Data Frame. Data Frames are commonly used in data analysis and manipulation tasks in Python, particularly in the pandas library. They provide a convenient way to organize and manipulate data, allowing for easy indexing, filtering, and aggregation operations.
24.
In a data frame, axis 0 is for
Explanation
In a data frame, axis 0 refers to the rows (the index) and axis 1 to the columns. The axis parameter in pandas specifies the axis along which an operation runs. With axis=0 the operation moves down the rows, so an aggregation such as sum(axis=0) returns one result per column. Therefore, the correct answer is rows.
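The sketch below contrasts the two axes on a small made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: move down the rows, giving one result per column.
down_rows = df.sum(axis=0).tolist()       # [6, 60]

# axis=1: move across the columns, giving one result per row.
across_cols = df.sum(axis=1).tolist()     # [11, 22, 33]

# With drop(), axis=0 means the label refers to a row.
without_first = df.drop(0, axis=0)
```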
25.
Which attribute of dataframe is used to perform the transpose operation on a dataframe?
Explanation
The attribute "T" is used to perform the transpose operation on a dataframe. This attribute returns the transpose of the dataframe, which means it switches the rows and columns of the dataframe.
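A minimal sketch of the T attribute on an invented two-row DataFrame:

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2], "b": [3, 4], "c": [5, 6]},
    index=["r1", "r2"],
)

# T is an attribute, so it is used without parentheses.
t = df.T

original_shape = df.shape      # (2, 3)
transposed_shape = t.shape     # (3, 2)

# Column labels become row labels and vice versa.
cell = t.loc["b", "r2"]        # 4
```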