Graphical Law of Large Numbers and Its Applications to Decision-Making Problems

Vladimir Ivanovich NORKIN, Roger J.-B. WETS
V.M. Glushkov Institute of Cybernetics, University of California (Davis)
Kyiv, 11 June 2013

Contents

Introduction
Some relations between convergence of functions and convergence of minima
Graphical convergence and its consequences
Decision-making problems: the language of inclusions
Stochastic inclusions
Laws of large numbers
The graphical law of large numbers
Discussion of the graphical law of large numbers
Stochastic programming problems

Introduction

How are statistical learning theory and stochastic programming theory related? (They have much in common but do not interact.)
How are decision-making problems (optimization) related to the law of large numbers (LLN) from probability theory? (The LLN is used to approximate functions that are mathematical expectations.)
What does the GRAPHICAL law of large numbers mean? (It generalizes the uniform LLN to set-valued functions.)
What does the graphical law of large numbers add? (It makes it possible to approximate problems with discontinuities.)

Some relations between convergence of functions and convergence of minima

Key theorem of (statistical) learning theory I: (a) ⟹ (b), where
(a) Consistency in the sense of Vapnik: for all $c$, $\inf_{\{x : R(x) \ge c\}} R_n(x) \to \inf_{\{x : R(x) \ge c\}} R(x) \ge c$.
(b) Uniform one-sided convergence: $\lim_n \sup_x \big(R(x) - R_n(x)\big) \le 0$.
(c) Pointwise convergence: $\lim_n R_n(x) = R(x)$.

Key theorem of (statistical) learning theory II: (b) + (c) ⟹ (a).

⇓

An important theorem of variational analysis, Rockafellar and Wets (1998): (e) ⟹ (d), where
(d) Convergence of infima: $\inf_{x \in X} R_n(x) \to \inf_{x \in X} R(x)$.
(e) Epi-graphical convergence: $\limsup_n \big(R(x) - R_n(x_n)\big) \le 0$ for all $x_n \to x$, and $\lim_n \big(R(x) - R_n(x_n)\big) = 0$ for some $x_n \to x$.
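
To see why pointwise convergence (c) alone does not give convergence of infima (d), here is a minimal Python sketch. The functions `risk_limit` and `risk_n` are hypothetical examples chosen for illustration and are not taken from the talk: $R \equiv 0$ on $X = [0, 1]$, and $R_n$ has a narrow dip of depth 1 centred at $x = 1/n$. The infimum is approximated on a finite grid, which is an assumption of the sketch.

```python
def risk_limit(x: float) -> float:
    """Limit function R(x) = 0 on X = [0, 1] (hypothetical example)."""
    return 0.0


def risk_n(x: float, n: int) -> float:
    """R_n(x): tent-shaped dip of depth 1 centred at x = 1/n, zero elsewhere."""
    return -max(0.0, 1.0 - abs(n * x - 1.0))


def inf_on_grid(f, points):
    """Approximate an infimum by the minimum over a finite set of points."""
    return min(f(x) for x in points)


def demo(n: int):
    # Grid on [0, 1] that contains the dip location 1/n (at k = 10).
    grid = [k / (10 * n) for k in range(10 * n + 1)]
    inf_n = inf_on_grid(lambda x: risk_n(x, n), grid)
    inf_limit = inf_on_grid(risk_limit, grid)
    # One-sided uniform gap sup_x (R(x) - R_n(x)) from condition (b).
    one_sided_gap = max(risk_limit(x) - risk_n(x, n) for x in grid)
    return inf_n, inf_limit, one_sided_gap


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, demo(n), risk_n(0.3, n))
```

For every fixed $x$ the value $R_n(x)$ is eventually 0, so (c) holds, yet the infimum of $R_n$ stays near $-1$ while the infimum of $R$ is 0. In this example the one-sided gap in (b) stays near 1, and the sequence $x_n = 1/n \to 0$ violates the first condition in (e).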

Graphical convergence and its consequences (Rockafellar and Wets, 1998)

Graphical convergence of functions and mappings, $F_n \xrightarrow{g} F$, is convergence of their graphs as sets, $\mathrm{gph}\,F_n \to \mathrm{gph}\,F$. That is, $F_n \xrightarrow{g} F$ if:
(a) for every $x_n \to x \in X$, $F_{n_k}(x_{n_k}) \ni y_{n_k} \to y$ implies $y \in F(x)$;
(b) for every $x \in X$ and $y \in F(x)$ there exist $x_n \to x$ and $y_n \in F_n(x_n)$ such that $\lim_n y_n = y$.

Theorem. Let the function $F : X \to \mathbb{R}$ be lower semicontinuous and let the mappings $G_n$ converge to $G$ both pointwise and graphically, $G_n \xrightarrow{g,p} G$. Then the minima also converge:
$$\inf_{\{x \in X :\ \vec 0 \in G_n(x)\}} F(x) \to \inf_{\{x \in X :\ \vec 0 \in G(x)\}} F(x).$$

Decision-making problems: the language of inclusions

Inclusions (generalized equations): find $x \in X$ such that $\vec 0 \in S(x)$, where $S$ is a set-valued mapping.
Inclusions generalize (systems of) equations: $\vec 0 = S(x)$, where $S$ is a single-valued mapping (a vector function).
Inclusions generalize (systems of) inequalities: $\{x \in X : \vec f(x) \le \vec 0\} \iff \{x \in X : \vec 0 \in S(x)\}$ with $S(x) := \vec f(x) + \mathbb{R}^m_+$.
Optimization problems (with inclusions): $F(x^*) = \min_{x \in X} F(x)$, $F(x^*) = \min_{\{x \in X :\ \vec 0 \in S(x)\}} F(x)$.
Necessary optimality conditions in the form of inclusions: $\vec 0 \in \partial F(x)$, where $\partial F$ is the subdifferential; $\vec 0 \in \partial F(x) + N_X(x)$, where $N_X(x)$ is the normal cone.

Stochastic inclusions

Inclusions with a parameter $y \in Y$: $X(y) = \{x \in X : 0 \in S(x, y)\}$.
Stochastic inclusions ($y$ is a random variable): $X^* = \{x \in X : 0 \in \mathbb{E}S(x) := \mathbb{E}_y S(x, y)\}$, which is not the same as $\{x \in X : 0 \in S(x, \mathbb{E}\,y)\}$.
Monte Carlo approximation: $X^n = \{x \in X : 0 \in S^n(x) := \frac{1}{n} \sum_{i=1}^n S(x, y_i)\}$.
Question: do the solution sets converge, $X^n \to X^*$?
Answer: convergence of the solutions follows from graphical convergence of the approximations, $S^n \xrightarrow{g} \mathbb{E}S$, i.e. from convergence of the graphs $\mathrm{gph}\,S^n \to \mathrm{gph}\,\mathbb{E}S$ (which is not the same as pointwise convergence $S^n(x) \to \mathbb{E}S(x)$).
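
The Monte Carlo approximation of a stochastic inclusion can be sketched on a small example. The mapping used here is an assumption made for illustration and does not come from the talk: $S(x, y) = \partial_x |x - y|$, which equals $\{+1\}$ for $x > y$, $\{-1\}$ for $x < y$ and $[-1, 1]$ for $x = y$. For this choice the sample inclusion $0 \in S^n(x)$ is solved by the sample medians, while $0 \in S(x, \mathbb{E}\,y)$ is solved by the mean, so the two problems differ. All function names are hypothetical.

```python
def averaged_subdifferential(x, samples):
    """Minkowski average (1/n) * sum_i S(x, y_i) for S(x, y) = d/dx |x - y|.

    Each term is {+1}, {-1} or the interval [-1, 1], so the average is
    an interval, returned as the pair (lo, hi).
    """
    n = len(samples)
    above = sum(1 for y in samples if x > y)   # terms equal to {+1}
    below = sum(1 for y in samples if x < y)   # terms equal to {-1}
    ties = n - above - below                   # terms equal to [-1, 1]
    return (above - below - ties) / n, (above - below + ties) / n


def solves_inclusion(x, samples):
    """Check whether 0 belongs to S^n(x)."""
    lo, hi = averaged_subdifferential(x, samples)
    return lo <= 0.0 <= hi


def monte_carlo_solution_set(samples, candidates):
    """X^n restricted to a finite list of candidate points."""
    return [x for x in candidates if solves_inclusion(x, samples)]


if __name__ == "__main__":
    samples = [0.1, 0.2, 0.5, 3.0, 6.2]       # skewed sample, mean about 2.0
    mean = sum(samples) / len(samples)
    print(monte_carlo_solution_set(samples, samples + [mean]))
    print(solves_inclusion(mean, samples))
```

With this skewed sample the only candidate that solves the sample inclusion is the median 0.5, and the sample mean does not solve it.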

Laws of large numbers (LLN)

LLN for independent random variables (Kolmogorov): $F^n := \frac{1}{n} \sum_{i=1}^n f(y_i) \to \mathbb{E}_y f(y)$ (with probability 1).
Uniform LLN (Glivenko-Cantelli): $\sup_{x \in X} \big| \frac{1}{n} \sum_{i=1}^n f(x, y_i) - \mathbb{E}_y f(x, y) \big| \to 0$ with probability 1. (This is not the same as the pointwise LLN: $\frac{1}{n} \sum_{i=1}^n f(\bar x, y_i) \to \mathbb{E}_y f(\bar x, y)$ for each $\bar x$ with probability 1.)
Uniform one-sided LLN of Vapnik and Chervonenkis.
Theorems on the rate of concentration in the LLN.
LLN for random sets (Artstein and Vitale (1975), Artstein and Hart (1981)).
Epi-graphical LLN for functions (Wets et al. (1988-1996)).
Uniform LLN for random mappings (Molchanov (1999), Teran (2008)); pseudo-uniform LLN (Shapiro and Xu (2007)).
Graphical LLN (for random mappings, Norkin and Wets (2013)).

Graphical law of large numbers for set-valued mappings

Theorem (Norkin and Wets (2013)). Let the random variables $\{\xi, \xi_1, \xi_2, \ldots\}$ and the mapping $G : X \times \mathbb{R}^l \to 2^{\mathbb{R}^m}$ satisfy the following conditions:
(a) $\{\xi, \xi_i\}$ are independent and identically distributed;
(b) $G$ is upper semicontinuous in $x$;
(c) $G$ is measurable in $y$;
(d) $\sup_{g \in G(x, \xi)} \|g\| \le K(\xi)$, where $K(\xi)$ is integrable.
Then, with probability one,
$$\mathrm{gph}\Big( \frac{1}{n} \sum_{i=1}^n G(\cdot, \xi_i) \Big) \to \mathrm{gph}\, \mathbb{E}_\xi\, \mathrm{conv}\{G(\cdot, \xi)\},$$
where the sum on the left is taken element by element (a Minkowski sum) and the expectation on the right is the set of integrals of all selections.

Discussion of the graphical law of large numbers

Pointwise LLN (Artstein and Vitale (1975), Artstein and Hart (1981)): for a fixed $\bar x$,
$$\mathrm{gph}\, \frac{1}{n} \sum_{i=1}^n G(\bar x, \xi_i) \to \mathrm{gph}\, \{\mathbb{E}_\xi\, \mathrm{conv}\{G(\bar x, \xi)\}\}.$$
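
The convex hull in the limit of the law of large numbers for random sets can be seen on a deliberately trivial case. The sketch below assumes a constant, non-random, non-convex set $G = \{-1, +1\}$ (a choice made for illustration, not an example from the talk) and computes the element-by-element average of $n$ copies exactly with rational arithmetic. The helper names are hypothetical.

```python
from fractions import Fraction


def minkowski_sum(a, b):
    """Element-by-element (Minkowski) sum of two finite sets of numbers."""
    return {u + v for u in a for v in b}


def minkowski_average(sets):
    """(1/n) * (A_1 + ... + A_n) for a list of n finite sets."""
    total = {Fraction(0)}
    for s in sets:
        total = minkowski_sum(total, s)
    return {t / len(sets) for t in total}


def half_widest_gap(points):
    """Half of the widest gap between neighbouring elements of a finite set.

    When the set contains both endpoints of an interval, this is the largest
    distance from a point of that interval to the set.
    """
    ordered = sorted(points)
    return max((b - a) / 2 for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    two_point = {Fraction(-1), Fraction(1)}
    for n in (1, 4, 20):
        average = minkowski_average([two_point] * n)
        print(n, len(average), half_widest_gap(average))
```

The average of $n$ copies is the set $\{-1, -1 + 2/n, \ldots, 1\}$, whose distance to the interval $[-1, 1] = \mathrm{conv}\,G$ is $1/n$, so the averages fill in the convex hull even though $G$ itself is not convex.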

Vapnik and Chervonenkis (1971, 1981): necessary and sufficient conditions for the uniform LLN: $\mathrm{Entropy}_X(n)/n \to 0$ (for indicator functions); $\mathrm{EpsilonEntropy}_X^{\varepsilon}(n)/n \to 0$, with $\sup_{x \in X} |f(x, \xi)| \le K(\xi)$ integrable.

Jennrich (1969): uniform law of large numbers (sufficient conditions): $f(\cdot, \xi)$ is continuous on a compact set $X$, and $\sup_{x \in X} |f(x, \xi)| \le K(\xi)$ is integrable.

Stochastic optimization problem

Optimization problem: $F(x^*) = \min_{x \in X} F(x)$.
Optimization problem with a (random) parameter $y$: $F(x^*(y), y) = \min_{x \in X} F(x, y)$.
Stochastic programming problem (minimization of expected risk): $F(x^*) = \min_{x \in X} \big[ F(x) = \mathbb{E}_y F(x, y) \big]$.
This is not the same as $F(x^{**}, \mathbb{E}y) = \min_{x \in X} F(x, \mathbb{E}y)$: the average damage (risk) $\mathbb{E}_y F(x, y)$ over all possible floods $y$ is not the same as the damage from a single average flood $\mathbb{E}y$.
So how do we solve it? The difficulty is computing the expectation $\mathbb{E}_y F(x, y)$.

Empirical approximation of stochastic programming problems

We approximate the theoretical mean (risk) $\mathbb{E}_y F(x, y)$ by the empirical mean $F^n(x) = \frac{1}{n} \sum_{i=1}^n F(x, y_i)$.
This leads to the approximate problem $F^n(x^n) = \min_{x \in X} \big[ F^n(x) = \frac{1}{n} \sum_{i=1}^n F(x, y_i) \big]$.
Convergence: if $F^n(x) \Rightarrow F(x)$ uniformly on $X$, then the solutions converge, $x^n \to X^*$.
The uniform (strong) law of large numbers, $\sup_{x \in X} |F^n(x) - F(x)| \to 0$ (with probability one), is not the same as the pointwise law of large numbers: $F^n(\bar x) \to F(\bar x)$ with probability 1 for any $\bar x \in X$.

Generalizations

For convergence of the minimizers, $x^n \to X^*$, uniform convergence $F^n(x) \Rightarrow F(x)$ is not necessary; the (weaker) convergence of epigraphs, $\text{epi-gph}\,F^n \to \text{epi-gph}\,F$, is sufficient.
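
The empirical approximation of a stochastic program, and the difference between minimizing the expected risk and solving the problem for one average flood, can be sketched as follows. The cost model `total_cost`, its coefficients, the sample and the grid standing in for $X$ are all assumptions made for illustration; the talk does not specify them.

```python
def total_cost(x, y, build_cost=1.0, damage_cost=10.0):
    """Hypothetical F(x, y): cost of a dam of height x plus damage from a flood of level y."""
    return build_cost * x + damage_cost * max(y - x, 0.0)


def empirical_risk(x, samples):
    """F^n(x) = (1/n) * sum_i F(x, y_i)."""
    return sum(total_cost(x, y) for y in samples) / len(samples)


def argmin_on_grid(objective, grid):
    """Minimizer of `objective` over a finite grid standing in for X."""
    return min(grid, key=objective)


def compare(samples, grid):
    """Return (x^n, x^**, F^n(x^n), F^n(x^**))."""
    mean_flood = sum(samples) / len(samples)
    x_saa = argmin_on_grid(lambda x: empirical_risk(x, samples), grid)
    x_mean = argmin_on_grid(lambda x: total_cost(x, mean_flood), grid)
    return x_saa, x_mean, empirical_risk(x_saa, samples), empirical_risk(x_mean, samples)


if __name__ == "__main__":
    samples = [0.0, 0.0, 0.0, 4.0]            # rare but large flood, mean level 1.0
    grid = [0.5 * k for k in range(13)]       # candidate dam heights 0.0, 0.5, ..., 6.0
    print(compare(samples, grid))
```

For this sample the empirical problem chooses a dam of height 4.0 with average cost 4.0, while the problem built on the mean flood chooses height 1.0, whose average cost over the same sample is 8.5.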

Graph (gph) and epigraph (epi-gph) of a function:
$\mathrm{gph}\,F = \{(x, F(x)) : x \in X\}$,
$\text{epi-gph}\,F = \{(x, z) : x \in X,\ z \ge F(x)\}$.

Figure: graphical illustration (convergence of epigraphs) for discontinuous functions.

Epi-graphical law of large numbers (Wets et al. (1988, 1990, 1993, 1996)):
$$\text{epi-gph}\, \frac{1}{n} \sum_{i=1}^n F(\cdot, y_i) \to \text{epi-gph}\, \mathbb{E}_y F(\cdot, y) \quad \text{(convergence of sets)}.$$

Decision-making problems: further generalizations

Problems with constraints in the form of inclusions: $F(x^*(y), y) = \min_{\{x \in X :\ 0 \in S(x, y)\}} F(x, y)$.
Special cases:
$x \in X(y)$ or $0 \in S(x, y)$ (generalized equation);
$\hat F(x^*(y)) = \min_{\{(x, z) :\ x + z \in X(y)\}} \big[ f(x, z) = F(x) + \mathbb{E}_y \lambda \|z\| \big]$ (two-stage programming problem).
Stochastic formulations:
$0 \in \mathbb{E}_\xi S(x, \xi)$ (stochastic inclusion);
$F(x^*) = \min_{\{x :\ 0 \in \mathbb{E}_\xi S(x, \xi)\}} F(x)$.