Reinforcement Learning and TicTacToe not working?!

Discussion on Reinforcement Learning and TicTacToe not working?! within the Java forum part of the Coders Den category.

#1   Shadow992
 

Hello guys,

at the moment I am getting deeper into the topic of reinforcement learning.
As I already know the basics of AI programming, I started with Q-learning and neural nets in AutoIt (just because I wanted to show what's possible with AutoIt). This did not work, so I decided to write a really basic implementation of Q-learning with a state/action table in AutoIt. This did not work either.

I thought it might be an AutoIt-specific problem, so I decided to implement an even more basic example in Java and tried again with core code similar to the AutoIt version.

In Java I wrote a little helper class called "QLearner", which is not much more than a Hashtable with some extended features. QLearner by itself works like a charm (no problems with adding/getting values or finding the biggest value for a given state/action pair).
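
A table of this kind could look roughly like the sketch below. The class name `QTableSketch`, the string-encoded board state, and all method names are assumptions for illustration; the actual QLearner code is not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a QLearner-style helper: a table that maps
// (state, action) pairs to Q-values. Names and layout are assumptions.
class QTableSketch {
    // Key is "state|action"; pairs that were never stored count as 0.0.
    private final Map<String, Double> values = new HashMap<>();

    private static String key(String state, int action) {
        return state + "|" + action;
    }

    double get(String state, int action) {
        return values.getOrDefault(key(state, action), 0.0);
    }

    void set(String state, int action, double q) {
        values.put(key(state, action), q);
    }

    // Highest Q-value over the given legal actions (0.0 if there are none).
    double maxValue(String state, int[] legalActions) {
        if (legalActions.length == 0) return 0.0;
        double best = Double.NEGATIVE_INFINITY;
        for (int a : legalActions) best = Math.max(best, get(state, a));
        return best;
    }

    // Legal action with the highest Q-value; ties go to the first one found.
    // Returns -1 if there is no legal action.
    int bestAction(String state, int[] legalActions) {
        int bestAction = -1;
        double best = Double.NEGATIVE_INFINITY;
        for (int a : legalActions) {
            double q = get(state, a);
            if (q > best) {
                best = q;
                bestAction = a;
            }
        }
        return bestAction;
    }
}
```

Encoding the board as a 9-character string keeps the key simple; any other state encoding with a stable `equals`/`hashCode` would work the same way.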

Then I wrote a basic class for the TicTacToe board; this works too.

After that I started implementing the classes "AiPlayer" and "Main".
The Main class isn't even that complex, so there should be no problems there either.

AiPlayer itself (except for learning) works great too, so it always takes the action with the biggest Q-value, etc.
But the learning itself seems to fail.
For learning I used just this simple approach:

Remember a maximum of X moves.
After each move, update the Q-value of the move before the current move with reward = 0.

If the game ends, do the same as above, but with the final reward:
update the Q-value of the move before the current move with reward = 100 (if won), reward = 0 (if draw), reward = -100 (if lost).
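
For comparison, the textbook tabular Q-learning update can be sketched as follows. This is not the code from this thread: the class name `QUpdateSketch` and the values of `ALPHA` (learning rate) and `GAMMA` (discount factor) are assumptions chosen only for illustration.

```java
// Sketch of the textbook tabular Q-learning update rule:
//   Q(s,a) <- Q(s,a) + ALPHA * (reward + GAMMA * max Q(s',a') - Q(s,a))
// ALPHA and GAMMA are assumed values, not taken from the original code.
class QUpdateSketch {
    static final double ALPHA = 0.1; // learning rate (assumed)
    static final double GAMMA = 0.9; // discount factor (assumed)

    // Mid-game update. With reward = 0, the new value is driven by maxNextQ,
    // the best Q-value reachable from the following state.
    static double update(double oldQ, double reward, double maxNextQ) {
        return oldQ + ALPHA * (reward + GAMMA * maxNextQ - oldQ);
    }

    // End-of-game update. There is no following state, so the future term
    // is 0 and only the final reward (e.g. 100, 0 or -100) moves the value.
    static double updateTerminal(double oldQ, double finalReward) {
        return update(oldQ, finalReward, 0.0);
    }

    public static void main(String[] args) {
        // Last move of a won game, starting from an untrained value of 0.
        double last = updateTerminal(0.0, 100.0);
        System.out.println("last move: " + last);
        // The move before it gets reward 0 and learns from the value above.
        double previous = update(0.0, 0.0, last);
        System.out.println("move before: " + previous);
    }
}
```

The point of the sketch is that a mid-game move with reward 0 only gains value through the best Q-value of the state that follows it, so the final reward has to propagate backwards over many games.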

That is the theory so far, and it seemed logical and sound to me. But something (maybe a code bug or a brain bug) makes it fail.

The two important classes are the following (learning is done in the train() method of the AiPlayer class):

Main.java

AiPlayer.java

I really hope you can help me. I am sure something went wrong with the Q-value calculation or with saving old states (even though saving old states itself seems to work).

The learning AI does not seem to learn at all (maybe a 2-4% improvement, but not much more).
I let it play millions of games, but the learning AI is still losing too often (my output):


Thanks in advance.

Edit:
If someone wants to understand what I am trying to do here, have a look at:


Edit:
I actually found the problem; there were 3 problems:
1. Finding 3 in a row takes a long time to learn if you do not add a simple check.
2. My Q-value calculation was wrong.
3. The loss/draw/win reward wasn't ideal.

Fixes:
1. Before each move, also check whether you can win the game by placing the next marker.
2. Fixed in code (coming later)
3. Loss reward = 0, draw reward = 0.5, win reward = 1
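
The check from fix 1 could be sketched like this. The `char[9]` board layout (with `' '` for an empty cell) and the names `WinCheckSketch`, `hasWon` and `findWinningMove` are assumptions for illustration, not the actual board class from this thread.

```java
// Sketch of an "immediate win" check on a 3x3 board stored as char[9],
// row by row, with ' ' marking an empty cell. Layout and names are assumed.
class WinCheckSketch {
    // All eight lines that count as three in a row.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8}, // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8}, // columns
        {0, 4, 8}, {2, 4, 6}             // diagonals
    };

    static boolean hasWon(char[] board, char player) {
        for (int[] line : LINES) {
            if (board[line[0]] == player
                    && board[line[1]] == player
                    && board[line[2]] == player) {
                return true;
            }
        }
        return false;
    }

    // Returns the index of an empty cell that wins the game at once for
    // the given player, or -1 if no such cell exists. The board is left
    // unchanged: each trial marker is removed again after the test.
    static int findWinningMove(char[] board, char player) {
        for (int cell = 0; cell < board.length; cell++) {
            if (board[cell] != ' ') continue;
            board[cell] = player;
            boolean wins = hasWon(board, player);
            board[cell] = ' ';
            if (wins) return cell;
        }
        return -1;
    }
}
```

A player could call `findWinningMove` before consulting the Q-table and take the returned cell whenever it is not -1, which spares the learner from having to discover the final winning move by trial and error.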


New code (only AiPlayer.java was changed):

Edit2:
After some more testing I found out that the earlier reward function (win = 1, draw = 0, loss = -1) worked as well, maybe even a little better. That's why I changed the edited code again.
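
The two reward schemes mentioned in the edits can be written side by side as below. The enum `Outcome` and the class and method names are hypothetical; only the numeric values come from the post.

```java
// Sketch of the two reward schemes from the post. Names are assumptions;
// only the numbers (1 / 0 / -1 and 1 / 0.5 / 0) are taken from the text.
class RewardSketch {
    enum Outcome { WIN, DRAW, LOSS }

    // Symmetric scheme: win = 1, draw = 0, loss = -1.
    static double reward(Outcome outcome) {
        switch (outcome) {
            case WIN:  return 1.0;
            case DRAW: return 0.0;
            default:   return -1.0;
        }
    }

    // Shifted scheme: win = 1, draw = 0.5, loss = 0.
    static double rewardShifted(Outcome outcome) {
        switch (outcome) {
            case WIN:  return 1.0;
            case DRAW: return 0.5;
            default:   return 0.0;
        }
    }

    public static void main(String[] args) {
        for (Outcome o : Outcome.values()) {
            System.out.println(o + ": " + reward(o) + " / " + rewardShifted(o));
        }
    }
}
```

Note that the shifted values are just `(reward + 1) / 2`, so both schemes rank the three outcomes in the same order.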
02/25/2015, 14:47   #2   XxharCs
 
I would like to help you more, but I haven't worked with reinforcement learning :/ (though it's an interesting topic)

But I found a nice example that uses reinforcement learning and a NN for TicTacToe


Maybe this will help
02/25/2015, 15:20   #3   Shadow992
 
Quote:
Originally Posted by XxharCs
I would like to help you more, but I haven't worked with reinforcement learning :/ (though it's an interesting topic)

But I found a nice example that uses reinforcement learning and a NN for TicTacToe


Maybe this will help
Thanks for trying to help me, but my code was inspired by this very example.
At first I tried to implement this code in AutoIt, but I thought the code did not work because of different NN libs or because of AutoIt problems.

But because I just wanted to understand Q-learning itself, I started a basic approach in Java, which also failed (that's quite frustrating).

Maybe someone knows more about Q-learning and can help me.
Is it possible that Q-learning cannot learn well by playing against an opponent that makes random moves?
I do not think so, but I have not even tried it yet.

Hopefully someone can help me before I have to try all possible "problems".
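
For reference, a random-move opponent of the kind asked about here could be sketched as follows. The `char[9]` board with `' '` for empty cells and the class name `RandomOpponentSketch` are assumptions; the sketch says nothing about whether training against such an opponent is sufficient.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of an opponent that picks a uniformly random empty cell.
// Board layout (char[9], ' ' = empty) and all names are assumptions.
class RandomOpponentSketch {
    private final Random rng;

    // A fixed seed makes training runs repeatable.
    RandomOpponentSketch(long seed) {
        this.rng = new Random(seed);
    }

    // Returns the index of a randomly chosen empty cell, or -1 if the
    // board is full.
    int chooseMove(char[] board) {
        List<Integer> empty = new ArrayList<>();
        for (int i = 0; i < board.length; i++) {
            if (board[i] == ' ') empty.add(i);
        }
        if (empty.isEmpty()) return -1;
        return empty.get(rng.nextInt(empty.size()));
    }
}
```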
02/26/2015, 18:22   #4   VisionEP1
@edit: Writing English on my phone doesn't work at all; I'll fix it when I'm home.


All times are GMT +1.

