- Linear Regression - Make predictions for sales forecasting, price optimization, marketing optimization, and financial risk assessment.
- Logistic Regression - Predict customer churn, predict response versus advertising spend, predict customer lifetime value, and monitor how business decisions affect predicted churn rates.
- Naive Bayes - Build a spam detector, analyze customer sentiment, or automatically categorize products, customers, or competitors.
- K-means clustering - Useful for cost modeling and customer segmentation.
- Hierarchical clustering - Useful for modeling business processes or segmenting customers based on survey responses.
- K-nearest neighbor classification - A type of instance-based learning. Use it for text document classification, financial distress prediction modeling, and competitor analysis and classification.
- Principal component analysis - A dimensionality reduction method that you can use for fraud detection, speech recognition, and spam detection.
Unix Server (Edge Node) hangs when many jobs started from the edge node are running on the Hadoop cluster
When a Unix server or edge node is running lots of jobs (like Spark, Hadoop, or custom batch processes), crashes happen. For example, a process might hit a segmentation fault, a memory issue, or any other runtime issue. By default, if ulimit -c is not 0, the OS will create a core dump. Core dumps are written to disk and can be very large, sometimes hundreds of MBs or even GBs per process.

What we realized was that when multiple processes crash at the same time, the system suddenly tries to write all of their core files to disk. This was causing disk I/O spikes, along with CPU spikes because the OS was handling the crash logging, and the node was becoming unresponsive.

Setting "ulimit -c 0" disables core dumps. This way we lose the ability to debug crashes via core dumps, but the production edge nodes stay stable.

On most Linux systems, by default, core dumps are written to the current working directory of the process that crashes. Linux allows you to change core dump file nam...
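
For jobs launched from a Python wrapper rather than a shell, the same limit can be applied per process. The following is a minimal sketch, assuming a Linux edge node and Python's standard `resource` module. The helper names `disable_core_dumps` and `run_without_core_dumps` are hypothetical and only for illustration. Note that this sketch lowers only the soft limit (similar to "ulimit -S -c 0"), so the limit can still be raised again when a core dump is needed for debugging.

```python
import resource
import subprocess


def disable_core_dumps():
    """Set the soft core-file size limit of the calling process to 0."""
    # Keep the existing hard limit so the soft limit can be raised later.
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def run_without_core_dumps(cmd):
    """Hypothetical launcher: run cmd with core dumps disabled in the child only."""
    # preexec_fn runs in the child just before exec, so the parent's
    # own limit is left unchanged.
    return subprocess.run(
        cmd,
        preexec_fn=disable_core_dumps,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Ask a child shell to report the core-file limit it inherited.
    result = run_without_core_dumps(["sh", "-c", "ulimit -c"])
    print("child core limit:", result.stdout.strip())
```

Applying the limit in the launcher keeps the setting scoped to the batch jobs themselves, which may be preferable to changing it for every session on the node.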