Data Analysis

Data analysis, in essence, is the process of using data to discover, think about, and solve problems. In a narrower sense, within major tech companies, data analysis means operating and maintaining a metrics system and, in response to business needs, distilling findings into a data-grounded language that product and operations teams can understand.

  • Basic Skills: Statistics, Python, SQL, Machine Learning, Data Visualization. It is recommended to search for relevant courses on platforms like YouTube; deep learning can be treated as an elective.
  • Core Business Concepts: Event tracking, Metrics system, A/B testing, User persona, Feature engineering.
  • Recommended Books: Lean Analytics (can be skipped if time is limited).
  • Interview Reference: Data Analysis Interview Example

Overview

  • Focus on thinking, not just tools!
  • The key is not just to present the data, but to make the business stakeholders listen and understand!
  • Insist on hypothesis testing, rather than having preconceived notions!

![Pasted image 20250306110537.png](../image/数据分析/Pasted image 20250306110537.png)

==Personal weakness: Analytical methods==

Core Analysis Pipeline: Define the problem -> Analyze and deconstruct -> Identify the root cause -> Solve the problem / Propose actionable suggestions (Prerequisite: Understand the business and the users).

Data Analysis Thinking and Process

I. Determine Business Objectives

Determine the general direction of analysis based on the product’s business stage (Introduction, Growth, Maturity, Decline). The OSM model (Objective, Strategy, Measurement) can then be used to break the objective down into strategies and the metrics that measure them.

II. Deconstruct the Business Metrics System

  • Core Metrics: The most critical metrics for measuring business success.
  • Result Metrics: Includes primary and secondary metrics, used to evaluate the final business outcomes.
  • Process Metrics: Used in conjunction with the funnel model. The secret to telling a good data story often lies in digging deep into process metrics.
  • Dimension Metrics: Dimensions for slicing the metrics above that are strongly tied to the business, such as geographic location or user persona.

III. Logical Thinking

![Pasted image 20250306114147.png](../image/数据分析/Pasted image 20250306114147.png)

  • Logical Reasoning: Deriving conclusions from scattered information.
    • Use induction, deduction (major and minor premises), and analogy to spot logical flaws in requests from business stakeholders, so that requests with no analytical value can be pushed back on.
  • Structured Thinking: Maintain clear hierarchy and rigorous logical sequence.
    • Issue Tree: Break down large problems into smaller ones. Through the process of “Idea Collection -> Categorization -> Summarization and Supplementation,” quickly find good deconstruction dimensions, following the MECE principle (mutually exclusive, collectively exhaustive) as far as is practical. This method is time-consuming but excels in comprehensiveness.
    • Hypothesis Tree: Test and judge based on hypotheses.
    • Judgment Tree: Make judgments through branches, similar to the decision tree algorithm.
  • Systematic Thinking: Step outside the problem itself to view it.
    • Uncover the essence of the problem to redefine it.

IV. Business Implementation

![Pasted image 20250306114500.png](../image/数据分析/Pasted image 20250306114500.png)

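The ultimate goal of all business actions is to influence user behavior! To understand user behavior, refer to the Fogg Behavior Model, ==B = MAP==: a behavior occurs only when Motivation, Ability, and a Prompt converge at the same moment, so the three factors are not simply additive.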

  • Interpret data from a business perspective: To tell a good data story, be user-centric (understand which users are doing what, and why) and use data as evidence for the conclusion rather than picking data to fit a conclusion already reached.
  • Propose actionable suggestions: Specify concrete implementation actions, such as strengthening specific channels, optimizing product design, adjusting campaign timings, etc.

V. Relevant Scenarios

Scenario Analysis and Analytical Methods Reference: ByteDance Internal Data Analyst Training Video (Practical Skills + Case Studies)

1. Traffic Analysis

Channel Analysis

Usually divided into primary and secondary channels.

![Pasted image 20250306193325.png](../image/数据分析/Pasted image 20250306193325.png)

  • Key Metrics:
    • User Side: Number of valid users, Day 1 (Next-day) retention rate, Day 7 retention rate.
    • Channel Side: ROI (Return on Investment; only meaningful once the product is monetized).
  • Analysis Dimensions: Structural analysis, Trend analysis, Comparative analysis, Fraud analysis.
    • Structural Analysis: Through funnel models or data drill-down, dig deep into “why the conversion rate at a specific stage is low.”

Feature Analysis

Feature Penetration Analysis

![Pasted image 20250306194219.png](../image/数据分析/Pasted image 20250306194219.png)

Mainly evaluates how many users are using the feature and the intensity of user demand for it.
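As a rough sketch of the two quantities mentioned here, penetration can be computed as the share of active users who used the feature, and demand intensity can be approximated by average uses per feature user. Both definitions and the function names are assumptions for illustration; a team's actual metric definitions may differ.

```python
from collections import Counter


def penetration_rate(feature_users, active_users):
    """Share of active users who used the feature at least once in the period."""
    active = set(active_users)
    if not active:
        return 0.0
    return len(set(feature_users) & active) / len(active)


def avg_uses_per_feature_user(usage_events):
    """Average number of uses per feature user (a rough proxy for demand intensity).

    usage_events: iterable of user ids, one entry per use of the feature.
    """
    counts = Counter(usage_events)
    return sum(counts.values()) / len(counts) if counts else 0.0


# Hypothetical data: five active users, six feature-use events.
active_users = ["u1", "u2", "u3", "u4", "u5"]
usage_events = ["u1", "u1", "u2", "u1", "u3", "u3"]

print(f"penetration: {penetration_rate(usage_events, active_users):.0%}")
print(f"uses per feature user: {avg_uses_per_feature_user(usage_events):.1f}")
```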

Feature Value Analysis

  1. Number of Core Feature Users: Define core features through metrics such as usage duration, usage frequency, and active days.
  2. Feature Contribution to Overall Retention: Feature A's contribution to overall retention = Feature A's penetration rate × Overall retention rate uplift brought by Feature A.
  3. Feature Revenue: Evaluate the direct or indirect commercial value brought by the feature.

Note: The growth trend of the overall user base may be inconsistent with the trend of core users, which requires analysis based on specific circumstances.
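The retention-contribution formula in item 2 can be worked through with a small example. The sketch below assumes that "uplift" means the retention of feature users minus the retention of non-users; the source does not define it, so treat that as one possible reading. Under that assumption, penetration × uplift equals the amount by which overall retention exceeds the non-user baseline.

```python
def feature_retention_contribution(penetration, retention_users, retention_non_users):
    """Contribution of a feature to overall retention.

    Assumption: uplift = retention of feature users - retention of non-users.
    """
    uplift = retention_users - retention_non_users
    return penetration * uplift


# Hypothetical numbers for Feature A.
penetration = 0.30          # 30% of active users use Feature A
retention_users = 0.55      # retention among Feature A users
retention_non_users = 0.40  # retention among everyone else

contribution = feature_retention_contribution(
    penetration, retention_users, retention_non_users
)

# Cross-check: overall retention is the penetration-weighted average of both groups,
# so overall - baseline should equal the contribution.
overall = penetration * retention_users + (1 - penetration) * retention_non_users

print(f"contribution: {contribution:.1%}")
print(f"overall - baseline: {overall - retention_non_users:.1%}")
```

This difference is correlational: heavy users may both adopt the feature and retain better for other reasons, so an A/B test is the safer way to estimate the uplift the feature actually causes.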

Traffic Fluctuation Analysis

  • DAU (Daily Active Users) Fluctuation: Investigate external factors such as channels, traffic entry points, and user personas, as well as internal factors like version updates, feature iterations, and marketing campaigns.
  • User Retention Fluctuation: Segment users into new and existing users (existing users can be further subdivided into core and non-core users). For fluctuations in existing user retention, it can be further deconstructed into the retention performance of specific features, thereby accurately pinpointing the main factors causing the data decline.
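One way to start the segmentation described above is to split the total DAU change into per-segment contributions and look at the largest one first. The segment names, the numbers, and the helper `dau_change_by_segment` below are hypothetical, and the same breakdown could be applied to channels, versions, or features.

```python
def dau_change_by_segment(before, after):
    """Break the total DAU change into per-segment contributions.

    before, after: dicts mapping segment name -> DAU in each period.
    Returns (segment, delta, share_of_total_change) rows, largest |delta| first.
    """
    segments = set(before) | set(after)
    deltas = {s: after.get(s, 0) - before.get(s, 0) for s in segments}
    total = sum(deltas.values())
    rows = [(s, d, d / total if total else 0.0) for s, d in deltas.items()]
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


# Hypothetical DAU by user segment for two comparable days.
before = {"new": 2000, "existing_core": 5000, "existing_non_core": 3000}
after = {"new": 1900, "existing_core": 4950, "existing_non_core": 2350}

for segment, delta, share in dau_change_by_segment(before, after):
    print(f"{segment}: {delta:+d} ({share:.1%} of total change)")
```

The segment that explains most of the change is then the natural place to drill down further, for example into the retention of individual features within that segment.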