Posts

Showing posts from December 1, 2016

Sampling and sampling distributions

In many cases, we would like to learn something about a large population without actually inspecting every unit in that population. To do so, we draw a sample that permits us to draw conclusions about the population of interest. We may, for example, draw a sample from the population of Dutch men aged 18 and older to learn something about the joint distribution of height and weight in this population. Because we cannot draw conclusions about the population from a sample without error, it is important to know how large these errors may be, and how often incorrect conclusions may occur. An objective assessment of these errors is only possible for a probability sample. In a probability sample, the probability of inclusion in the sample is known and positive for each unit in the population. Drawing a probability sample of size n from a population consisting of N units may be a quite complex random experiment. The experi…
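The idea can be sketched in a few lines of Python. The population values below are made up for illustration; the point is that in a simple random sample every unit has the same, known inclusion probability n/N, so we can attach a standard error to our estimate.

```python
import random
import statistics

# Hypothetical population: heights (cm) of N = 1000 men.
random.seed(42)
population = [random.gauss(181, 7) for _ in range(1000)]

# Simple random sample without replacement of size n = 50:
# every unit has the same, known inclusion probability n / N.
n = 50
sample = random.sample(population, n)

# The sample mean estimates the population mean; its standard
# error indicates how large the estimation error typically is.
mean = statistics.mean(sample)
se = statistics.stdev(sample) / n ** 0.5
print(f"estimate: {mean:.1f} cm, standard error: {se:.2f} cm")
```

With a probability sample like this, the standard error gives exactly the objective assessment of error the text refers to.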

Cross-Validation

Cross-validation is a resampling technique that is often used for model selection and for estimating the prediction error of a classification or regression function. We have already seen that squared error is a natural measure of prediction error for regression functions: PE = E[(y − f̂)²]. Estimating prediction error on the same data used for model estimation tends to give downward-biased estimates, because the parameter estimates are "fine-tuned" to the peculiarities of the sample. For very flexible methods, e.g. neural networks or tree-based models, the error on the training sample can usually be made close to zero. The true error of such a model will usually be much higher, however: the model has been "overfitted" to the training sample. One way of dealing with this problem is to include a penalty term for model complexity (e.g. AIC, BIC). An alternative is to divide the available data into a training sample and a test sample and to estimate the predict…
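A minimal k-fold cross-validation sketch in plain Python (the data and the linear model are made up for illustration): each fold is held out in turn, the model is fitted on the remaining folds only, and the held-out squared errors are averaged to estimate PE = E[(y − f̂)²].

```python
import random

random.seed(0)

# Toy regression data: y = 2x + noise (hypothetical example).
xs = [random.uniform(0, 10) for _ in range(100)]
ys = [2 * x + random.gauss(0, 1) for x in xs]

def fit_line(x, y):
    """Least-squares fit y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def kfold_pe(xs, ys, k=5):
    """Estimate PE = E[(y - f_hat)^2] by k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        xtr = [xs[i] for i in idx if i not in held_out]
        ytr = [ys[i] for i in idx if i not in held_out]
        a, b = fit_line(xtr, ytr)     # fit on training folds only
        errors += [(ys[i] - (a + b * xs[i])) ** 2 for i in fold]
    return sum(errors) / len(errors)

print(f"5-fold CV estimate of prediction error: {kfold_pe(xs, ys):.2f}")
```

Because every error is computed on data the model never saw during fitting, the estimate avoids the downward bias of training-sample error described above.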

Modern data analytic tools

A classical approach to supervised classification was to combine and transform the raw measured variables to produce 'features', defining a new data space in which the classes are linearly separable. This basic principle has been developed very substantially in the notion of support vector machines, which use some clever mathematics to permit the use of what is effectively an infinite number of features. Early experience suggests that methods based on these ideas produce highly effective classification algorithms. Time series occupy a special place in data analysis because they are so ubiquitous, and as a result of their importance a wide variety of methods has been developed for them. Statistics and machine learning, the two legs on which modern intelligent data analysis stands, differ in emphasis. One of these differences is the importance given to the interpretability of a model. For example, in both domains, recursive partitioning of tree…
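The 'feature' idea can be illustrated with a deliberately tiny, hypothetical example: two classes on a line are not linearly separable in x, but after mapping x to (x, x²) a single horizontal threshold — a linear boundary in the new space — separates them.

```python
# Two classes that no single threshold on x can separate:
inner = [-0.5, 0.0, 0.5]          # class A, clustered near 0
outer = [-2.0, -1.5, 1.5, 2.0]    # class B, far from 0

def feature(x):
    # Transform the raw variable into a new data space (x, x**2).
    return (x, x * x)

# In the transformed space, the linear boundary x**2 = 1
# separates the two classes perfectly.
assert all(feature(x)[1] < 1 for x in inner)
assert all(feature(x)[1] > 1 for x in outer)
print("classes linearly separable in feature space")
```

Support vector machines push exactly this trick further, implicitly working with very high- (even infinite-) dimensional feature spaces.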

Intelligent data analysis

Intelligent Data Analysis provides a forum for the examination of issues related to the research and application of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization; data pre-processing (fusion, editing, transformation, filtering, sampling); data engineering; database mining techniques, tools and applications; use of domain knowledge in data analysis; big data applications; evolutionary algorithms; machine learning; neural nets; fuzzy logic; statistical pattern recognition; knowledge filtering; and post-processing. In particular, papers are preferred that discuss the development of new AI-related data analysis architectures, methodologies, and techniques and their application to various domains. Data simply comprise a collection of numerical values recording the magnitudes of various attributes of the objects under study. Then…

Why Big data?

1. Understanding and Targeting Customers. This is one of the biggest and most publicized areas of big data use today. Here, big data is used to better understand customers and their behaviours and preferences. Companies are keen to expand their traditional data sets with social media data, browser logs, text analytics and sensor data to get a more complete picture of their customers. The big objective, in many cases, is to create predictive models. You might remember the example of the U.S. retailer Target, which can now very accurately predict when one of its customers is expecting a baby. Using big data, telecom companies can better predict customer churn, Wal-Mart can predict which products will sell, and car insurance companies can understand how well their customers actually drive. Even government election campaigns can be optimized using big data analytics.

2. Understanding and Optimizing Business Processes. Big data is also increasingly…

The register organization of the 8086 processor

The 8086 has a powerful set of registers: general-purpose registers, segment registers, pointer and index registers, and a flag register. This register set is also known as the programmer's model.

General-purpose registers:
- Four 16-bit general-purpose registers: AX, BX, CX, DX.
- Each register splits into two 8-bit registers, e.g. AL (lower byte) and AH (higher byte).

Segment registers (16-bit):
- Code segment (CS)
- Data segment (DS)
- Stack segment (SS)
- Extra segment (ES)

Pointer and index registers:
- Stack pointer (SP)
- Base pointer (BP)
- Source index (SI)
- Destination index (DI)

Flag register:
- Carry flag
- Parity flag
- Auxiliary carry flag
- Zero flag
- Sign flag
- Overflow flag
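The split of a 16-bit general-purpose register into a high and a low byte can be illustrated with a few bit operations (a Python sketch for illustration, not 8086 code):

```python
# AX holds a 16-bit value; AH is its high byte, AL its low byte.
ax = 0x1234

ah = (ax >> 8) & 0xFF    # higher byte of AX
al = ax & 0xFF           # lower byte of AX
print(hex(ah), hex(al))  # prints: 0x12 0x34

# Recombining the two bytes restores the 16-bit value.
assert (ah << 8) | al == ax
```

This is exactly why an 8-bit write to AL or AH changes only half of AX.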

The 8086 Microprocessor

1. What is meant by addressing modes?
The different ways in which a microprocessor can access data are referred to as addressing modes.

2. What is meant by a vectored interrupt?
When an external device interrupts the processor, the processor has to execute an interrupt service routine to service the interrupt. If the internal control circuit of the processor produces a CALL to a predefined memory location that is the starting address of the interrupt service routine, then that address is called the vector address, and such interrupts are called vectored interrupts.

3. Name the predefined (dedicated) interrupts of the 8086.
1. Divide-by-zero interrupt (type 0)
2. Single-step interrupt (type 1)
3. Non-maskable interrupt (type 2)
4. Breakpoint interrupt (type 3)
5. Overflow interrupt (type 4)

4. What are assembler directives? Give two examples.
There are some instructions in assembly language which are not part of the processor's instruction set. These instructions are instructions to the assembler…
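The vector addresses behind these interrupt types follow a simple rule: the 8086 interrupt vector table occupies the first 1 KB of memory, each vector is 4 bytes (offset, then segment), so the vector for interrupt type N starts at physical address N × 4. A small sketch:

```python
# 8086 interrupt vector table: 256 vectors x 4 bytes = 1 KB at 0x00000.
def vector_address(interrupt_type):
    """Physical start address of the 4-byte vector for a given type."""
    return interrupt_type * 4

# The predefined types listed above:
for name, t in [("divide by zero", 0), ("single step", 1),
                ("NMI", 2), ("breakpoint", 3), ("overflow", 4)]:
    print(f"type {t} ({name}): vector at {vector_address(t):#06x}")
```

So the NMI (type 2) vector, for example, lives at address 0x0008.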

Lightweight Directory Access Protocol (LDAP)

Once upon a time, in the dim and distant past (the late 1970s to early 1980s), the ITU (International Telecommunication Union) started work on the X.400 series of email standards. This email standard required a directory of names (and other information) that could be accessed across networks in a hierarchical fashion, not dissimilar to DNS for those familiar with its architecture. This need for a global network-based directory led the ITU to develop the X.500 series of standards, and specifically X.519, which defined DAP (Directory Access Protocol), the protocol for accessing a networked directory service. The X.400 and X.500 series of standards came bundled with the whole OSI stack and were big, fat and consumed serious resources. Standard ITU stuff, in fact. Fast forward to the early 1990s, and the IETF saw the need for access to global directory services (originally for many of the same email-based reasons as the ITU) but without picking up all the gruesome prot…
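The DNS-like hierarchy survives in LDAP today: an entry's distinguished name (DN) encodes the path from the entry up to the directory root. A minimal sketch, using a hypothetical DN and assuming no escaped commas in the attribute values:

```python
# An LDAP DN lists one relative DN (RDN) per tree level,
# most specific first -- much like a reversed DNS name.
dn = "cn=John Doe,ou=People,dc=example,dc=com"

rdns = dn.split(",")   # naive split; real DNs may escape commas
print(rdns[0])         # the entry itself: cn=John Doe
print(rdns[-2:])       # the root suffix:  ['dc=example', 'dc=com']
```

Each `dc=` component plays roughly the role a DNS label does, anchoring the directory tree under a domain.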

DISTRIBUTED SYSTEMS

Compare synchronous and asynchronous communication.

Synchronous data transfer:
- sender and receiver use the same clock signal
- supports a high data transfer rate
- needs a clock signal between the sender and the receiver
- requires a master/slave configuration

Asynchronous data transfer:
- sender provides a synchronization signal to the receiver before starting the transfer of each message
- does not need a clock signal between the sender and the receiver
- slower data transfer rate

List the uses of the RMI registry.
Essentially, the RMI registry is a place for the server to register the services it offers and a place for clients to query for those services. The RMI registry acts as a broker between RMI servers and clients. The server "registers" its services in the registry, so an RMI registry can act as a "directory" for many servers/services. The client does not need to know the location of individual servers, and does…
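The broker role described above can be sketched in a few lines. This is a toy in-process model in Python, not Java RMI itself: servers bind names to services, and clients look services up by name without knowing where they live.

```python
class Registry:
    """A toy name-to-service directory, mimicking an RMI registry."""

    def __init__(self):
        self._services = {}

    def bind(self, name, service):
        # Server side: register a service under a well-known name.
        self._services[name] = service

    def lookup(self, name):
        # Client side: query the directory by name.
        return self._services[name]

registry = Registry()
registry.bind("echo", lambda msg: msg)   # server registers its service
echo = registry.lookup("echo")           # client queries by name only
print(echo("hello"))                     # prints: hello
```

In real RMI the lookup returns a remote stub rather than a local object, but the directory pattern is the same.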