Just something small and tiny.

Held by Ryan Ruoshui Yan.

Here is the world of a nerdy guy.


Artificial Intelligence Notes

Notes for CSC384

Disclaimer: The notes below are fully/partially NOT created by myself. They are from slides and/or wikipedia and/or textbook. The purpose of this post is simply to learn and review for the course. If you think something is inappropriate, please contact me at “ryan_yrs [at] hotmail [dot] com” immediately and I will remove it as soon as possible.


  • Math characterizations of rationality have come from diverse areas like logic and economics.
  • There is no universal agreement about which notions of rationality is best, but since these notions are precise we can study them and give exact characterizations of their properties.

AI Definitions

THINK Systems that think like humans Systems that think rationally
ACT Systems that act like humans Systems that act rationally
  • A computational method for capturing a particular version of this kind of reasoning.


  1. A STATE SPACE over which to search. State space necessarily involves ABSTRACTING real problem.
  2. An INITIAL STATE that best represents current state and DESIRED (or GOAL) CONDITION you want to achieve.
  3. ACTIONS (or SUCCESSOR FUNCTIONS) that allow move one from state to state. Actions are abstrctions of actions one could actually perform.
  4. Moving from state to state (taking an ACTION, advancing to a SUCCESSOR STATE) often incurs a COST.

Formalism Example 1

  1. Example
    • Drive from City A to B.
  2. State Space
    • Cities where one could be located.
  3. Action
    • Driving will advance you from one city to next.
  4. Initial State
    • In city A.
  5. Desired Condition
    • Be in a state where one is in B.
  6. Cost
    • Could be travel time btw cities.
  7. Solution
    • A sequence of cities to travel thru to get to B.

Formalism Example 2

  1. Example
    • 3 gallon jug (gal3)
    • 4 gallon jug (gal4)
    • Can fill either jug to top from a top
    • Can empty either jug
    • Can pour one jug into other
  2. State Space
    • Pairs of numbers where gal3 is # gallons in 3 gallon jug.
    • gal4 is # of gallons in 4 gallon jug.
  3. Action
    • Empty-3-Gallon
    • Empty-4-Gallon
    • Fill-3-Gallon
    • Fill-4-Gallon
    • Pour-3-Into-4
    • Pour-4-Into-3
  4. Initial State
    • Various
    • e.g.
      • (0,0)
  5. Desired Condition
    • Various
    • e.g.
      • (0,2)
      • (*,3) where * can be anything
  6. Cost
    • Various
    • Each action may incur a uniform cost.
  7. Solution
    • Various

Formalism Example 3

  1. Example
    • 8-Puzzle
  2. State Space
    • Diff configs of tiles (9! possibilities).
  3. Action
    • Move blank up, down, left, right.
  4. Initial State
    • 724/5_6/831
  5. Desired Condition
    • A state where tiles are in positions shown like 123/456/78_
  6. Cost
    • Vavious
    • Each action may incur a uniform cost.
  7. Solution
    • A sequence of moves of blank that transform initial state to a goal state.

Successor Function

  • S(x)
  • Yields a set of states that can be reached from x via any single action.
    • S(x) = {<y,a>, <z,b>}
      • Arrive at y via action a
      • Arrive at z via action b
    • S(x) = {<y,a>, <y,b>}
      • Arrive at y via action a
      • Arrive at y via action b alternatively
    • May also be important to reference parents of annotated.
  • Completeness
    • Will search always find a solution if a solution exists?
  • Optimality
    • Will search always find least cost solution?
  • Time Complexity
    • What is max # nodes (paths) than can be EXPANDED or GENERATED?
  • Space Complexity
    • What is max # nodes (paths) that have to be STORED in memory?

Branching Factor 分支因子

  • The number of children at each node, the outdegree. If this value is not uniform, an average branching factor can be calculated.
  • 每個結點下的子結點數,即出度。如果各個結點分支因子不同,則可以計算平均分支因子。
  • Dijkstra’s algorithm or a variant of it usd in AI.


See notes for CSC263.

  • Short: IDS (Iterative Deepening Search)
  • A state space/graph search strategy in which a depth-limited version of depth-first search is run repeatedly with increasing depth limits until the goal is found. IDDFS is equivalent to breadth-first search, but uses much less memory; on each iteration, it visits the nodes in the search tree in the same order as depth-first search, but the cumulative order in which nodes are first visited is effectively breadth-first.
  • 一個狀態空間(狀態圖)搜索策略。在這個搜索策略中,一個具有深度限制的深度優先搜索算法會不斷重複地運行,並且同時放寬對於搜索深度的限制,直到找到目標狀態。IDDFS 與廣度優先算法是等價的,但對內存的使用會少很多;在每一步迭代中,它會按深度優先算法中的順序,遍歷搜索樹中的節點,但第一次訪問節點的累積順序實際上是廣度優先的。
  • Depth-Limited DFS

Heuristic Technique 啟發法

  • Any approach to problem solving, learning, or discovery that employs a practical method not guaranteed to be optimal or perfect, but sufficient for the immediate goals.
  • 依據有限的知識在短時間內找到問題解決方案的一種技術。它是一種依據關於系統的有限認知和假說從而得到關於此系統的結論的分析行為。由此得到的解決方案有可能會偏離最佳方案。通過與最佳方案的對比,可以確保啟發法的質量。典型的啟發法有試錯法和排除法。

Heuristic Function

  • We can encode each notion of merit of a state into a heuristic function, h(n).
  • A heuristic function maps a state onto an estimate of cost to goal from that state.
  • A search algorithm which explores a graph by expanding the most promising node chosen according to a specified rule.

Admissible Heuristic

  • Never over-estimates cost to reach goal.
  • It is optimistic

Manhattan Distance 曼哈頓距離

  • A form of geometry in which the usual distance function of metric or Euclidean geometry is replaced by a new metric in which the distance between two points is the sum of the absolute differences of their Cartesian coordinates.
  • 標明兩個點上在標準坐標系上的絕對軸距之總和。


  • Stronger condition than global admissibility.
  • A MONOTONE heuristic satisfies condition.
    • h(n1) <= c(n1, a, n2) + h(n2)
  • Note that there might more than one transition (action) that joins n1 and n2, and inequality must hold for all of them.
  • If h(n) is admissible and monotonic, search will be both optimal and not locally mislead.

IDA* (Iterative Deepening A*)

  • A graph traversal and path search algorithm that can find the shortest path between a designated start node and any member of a set of goal nodes in a weighted graph.

Game Properties

  • Two-Player
    • Games that are for two.
  • Discrete
    • Games states or desicions can be mapped on discrete values.
  • Finite
    • There are only a finite number of states and possible decisions that can be made.
  • Zero-Sum
    • Fully competitive: If one player wins, other loses an equal amount.
  • Deterministic
    • No change involved: No dice, or random deals of cards, or coin flips, etc.
  • Perfect Information
    • All aspects of state are fully observable. e.g. No hidden cards.

Strategic/Normal Form Game

  • Single move each
  • “One shot” game

Entensive Form Game

  • Turn-taking: players act alternatively
  • e.g. chess, checkers, etc.
  • Extend over multiple moves

Two-Player Zero-Sum Game

  • Two PLAYERS A (max) and B (min)
  • Set of STATES S (a finite set of states of game)
  • An INITIAL STATE I \in S (where game begins)
  • TERMINAL POSITIONS T \BELONGSTO P (Terminal state of game: states where game is over)
  • Successors (or Succs: A fn that takes a state as input and returns a set of psb next states to whomever is due to move)
  • Utility (or Payoff Function) V: T -> R (A mapping from terminal states to real numbers that show good is each terminal state for player A, and bad for player B)

Game Tree

Nim 尼姆遊戲

  • Nim is a mathematical game of strategy in which two players take turns removing objects from distinct heaps. On each turn, a player must remove at least one object, and may remove any number of objects provided they all come from the same heap. The goal of the game is to be the player to remove the last object.
  • 一種兩個人玩的回合制數學戰略遊戲。遊戲者輪流從一堆棋子(或者任何道具)中取走一個或者多個,最後不能再取的就是輸家。

Minimax 極小化極大算法

  • A decision rule for minimizing the possible loss for a worst case scenario.
  • 一種找出失敗的最大可能性中的最小值的算法。
  • Depth-First Implementation of Minimax
  • Pruning (Cut)
    • Pruning of max nodes (α-cut)
    • Pruning of min nodes (β-cut)

Evaluation Function

A function used by game-playing programs to estimate the value or goodness of a position in the minimax and related algorithms.

Feature Vector

CSP (Constraint Satisfaction Problem 約束補償問題)

IBM Model

  • IBM Model 1: Lexical translation
  • IBM Model 2: Adds absolute RE-ORDERING MODEL

Waltz Algorithm

Type of Constraints

  • Unary: 1 var
    • C(X): X=2
    • C(Y): Y>5
  • Binary: 2 vars
    • C(X,Y): X+Y<6
  • Higher-Order: >= 3 vars

Eight Queens Puzzle 八皇后問題

  • The problem of placing eight chess queens on an 8×8 chessboard so that no two queens threaten each other. Thus, a solution requires that no two queens share the same row, column, or diagonal.
  • 如何能夠在8×8的國際象棋棋盤上放置八個皇后,使得任何一個皇后都無法直接吃掉其他的皇后?為了達到此目的,任兩個皇后都不能處於同一條橫行、縱行或斜線上。

Forward Checking

Keep track of remaining legal values for unassigned vars.

MRV (Minimum Remaining Values)

  • Not Most Restricted Variable
  • Always branch on a var with SMALLEST REMAINING VALUES (Smallest CurDom).
  • If a var has only one value left, that value is forced, so we should propagate its consequences immediately.
  • This heuristic tends to produce skinny trees at top. This means that more vars can be instantiated with fewer nodes searched, and thus more constraint propagation/DWO failures occur when tree starts to branch out. We start selecting vars with larger domains.
  • We can find a inconsistency much faster.

GAC (Generalized Arc Consistency)

Joint Distribution

Records probs that vars will hold particular values.


Converting raw counts of data into a legal prob dtbtn.

Bayesian Network

  • A GRAPHICAL REPRESENTATION of direct dependencies over a set of vars, together with a set of CONDITIONAL PROBABILITY TABLES quantifying strength of those influences.

Variable Elimination


  • Has vertices just like an ordinary graph, but instead of edges btw 2 vertices X <-> Y it contains HYPEREDGES

Elimination Width


KR (Knowledge Representation)

Symbolic encoding of propositions believed by some agent.


Manipulation of symbols encoding propositions to produce new representations of new propositions.

Knowledge Representation Hypothesis

Cognitive Penetrability

  • adj. “Cognitively penetrable”

FOL (First-Order Logic)

  • Contains
    • Object (which is constant)
      • student
      • subject
      • assignment
      • number
    • Predicate
      • difficult(subject)
      • CSMajor(student)
    • Relation (which is predicate)
      • handedIn(student, assignment)
    • Function
      • Grade(student, assignment) -> number
  • Has
    • Syntax
      • A grammar specifying what are legal syntactic constructs of representation.
    • Semantics
      • A formal mapping from syntactic constructs to set theoretic assertions.
  • Term
    • A var, or
    • A const, or
    • An expression of form f(t_1, …, t_k) where:
      1. f is a FUNCTION symbol
      2. k is its arity
      3. Each t_i is a TERM
  • Atom
    • Denotes facts that can be TRUE/FALSE about world.
    • An expression of form p(t_1, …, t_k) where:
      1. p is a PREDICATE symbol
      2. k is its arity
      3. each t_i is a TERM
    • Note: Consts are same as fns taking 0 arguments.
    • UPPERCASE for vars, lowercase for fn/const/pred symbols.

KB (Knowledge Base)

Entailment (⊨)

  • α ⊨ β
    • α entails β
    • β follows logically from α

Syntactical Derivation (⊢)

  • α ⊢ β
    • α derives β


  • KB ⊢ f -> KB ⊨ f
  • i.e. All conclusions arrived at via proof procedure are correct: they are logical consequences.


  • KB ⊨ f -> KB ⊢ f
  • i.e. Every logical consequence can be generated by proof procedure.

Interpretation (Model)

  • A tuple <D, Φ, Ψ, v>
    • D: A non-empty set (domain of individuals)
    • Φ: A mapping: Φ(f) -> (D^k->D)
      • Maps k-ary fn symbol f, to a fn from k-ary tuples of individuals to individuals.
    • Ψ: A mapping: Ψ(p) -> (D^k -> True/False)
      • Maps k-ary pred symbol p, to an indicator fn over k-ary tuples of individuals (a subset of D^k).
    • v: A var assignment fn. v(X) = d \in D
      • It maps every var to some individual.

Domain Axiomatization

  • More sentences there are in KB, fewer models (satisfying interpretations) there are.
  • More one writes down, closer one gets to world. Because each sentence in KB rules out unintended interpretations.

Resolution Proof





MGU (Most General Unifier)



Haskell Notes

Disclaimer: The notes below are fully/partially NOT created by myself. They are from slides and/or wikipedia and/or textbook. The purpose of this post is simply to learn and review for the course. If you think something is inappropriate, please co...…


Neural Network & Machine Learning Notes

Notes for CSC321Disclaimer: The notes below are fully/partially NOT created by myself. They are from slides and/or wikipedia and/or textbook. The purpose of this post is simply to learn and review for the course. If you think something is inapprop...…